
1 Introduction

Methylation occurs on arginine and lysine residues in histones and is involved in a wide range of biological processes such as gene expression, chromatin organization, dosage compensation, and epigenetic memory. Unlike acetylation, where positive charges on histones are removed, relaxing chromatin and activating genes, methylation or its removal does not affect charges on histones. Histone methylation can be transcriptionally repressive or activating, depending on the position of the methylated residue and the extent of methylation. Histone lysine residues can be monomethylated, dimethylated, or trimethylated [1], while arginine residues can be monomethylated or dimethylated symmetrically or asymmetrically [2]. The methylation of lysines 4, 36, or 79 of histone H3 is typically associated with active transcription, while the methylation of lysines 9 or 27 on histone H3 and lysine 20 on histone H4 contributes to repressed transcription. Methylation, in combination with other histone “marks” such as acetylation, phosphorylation, and ubiquitination, as well as the presence of other regulatory factors and DNA methylation, ultimately determines chromatin conformation and the expression level of the associated gene.

In order to change chromatin structure and gene expression level, histone modifications have to be altered, which can include reversal of methylation, i.e., demethylation. Unlike phosphorylation and acetylation, histone methylation was long regarded as irreversible, partly because of the stable nature of the carbon-nitrogen bond and the lack of evidence for its role in dynamic regulation of gene expression [3]. In addition, in a number of early studies, the half-life of histones and methyl-lysine residues within them appeared to be the same, implying histone methylation was not reversible [4, 5]. However, other studies seemed to show that active turnover of methyl groups did occur at a low but detectable level [6, 7]. Furthermore, in many documented cases, it appeared that different histone methylation patterns were necessary for regulation of gene expression [8].

As early as the 1960s, protein extracts containing lysine demethylase activity were identified and partially purified [9,10,11]. Decades later, in 2004, as evidence was mounting that histone methylation was dynamic and reversible, the first histone demethylase, a flavin-containing amine oxidase, was finally identified. Lysine-Specific Demethylase 1 (LSD1, also known as KDM1A) provided the first experimental evidence for enzymatic histone demethylation [12]. It was shown to mediate oxidative demethylation on monomethylated or dimethylated H3K4, but not trimethylated H3K4, because the enzymatic mechanism requires a protonated nitrogen [13]. Subsequently, a second and larger class of demethylases containing what was called a Jumonji C (JmjC) domain was theorized [14] and then independently identified through biochemical purification in 2005 [15]. Within a year, more JmjC demethylases were discovered in rapid succession [16,17,18,19] and even a crystal structure of a JmjC catalytic domain was published [20]. Unlike KDM1A, the JmjC proteins do not require a protonated nitrogen, which allows some JmjC family members to act on trimethylated H3K4 as well [13].

Arginine methylation is also a very stable mark, but unlike lysine methylation, it is still unclear if this modification can be enzymatically reversed. Arginine residues can be monomethylated or asymmetrically or symmetrically dimethylated on R2, R8, R17, and R26 on H3, and R3 on both histones H2A and H4 [21]. A putative arginine demethylase JMJD6, a JmjC domain-containing protein, was reported to demethylate both asymmetric and symmetric H3R2me2 and H4R3me2 substrates [22]. However, others have not been able to replicate this observation and have found JMJD6 to be a lysine hydroxylase, catalyzing C-5 hydroxylation of lysine residues in mRNA splicing-regulatory proteins and in histones, with no demethylase activity on either H3R2me2 or H4R3me2 peptides [23,24,25]. Additionally, mutational and structural analysis of JMJD6 suggests that it is not an arginine demethylase. It has a novel substrate binding groove and two positively charged surfaces with a stack of aromatic residues located near the active center, features not found in any other JmjC oxygenase family member, and it may even interact with ssRNA [26, 27]. While it appears JMJD6 may play some role in regulating gene expression, there is still no clear consensus on what that role is [25]. Interestingly, it was very recently reported that some JmjC demethylases are also able to demethylate histone and non-histone arginine residues in vitro, albeit not as efficiently as histone lysine residues; a crystal structure of a known JmjC lysine demethylase (KDM4A) with an H4R3me2s-containing peptide (PDB code 5FWE) was also determined [28].

Arginine residues are not only substrates for methylation, but can also be converted to citrulline by deimination through the action of peptidylarginine deiminases [29]. Deimination has been suggested as a path for arginine demethylation but it appears that these deiminases act only on unmethylated arginines, and not on methylated arginine [30, 31]. As no enzyme has been found that converts citrulline back to arginine, methylation of arginine can only be antagonized by this modification [32].

Thus, since there are no verifiable in vivo arginine demethylases at this time, this chapter will only discuss histone lysine (K) demethylases (KDMs). Over the last decade, more than 20 KDMs have been discovered, representing two distinct classes: the KDM1 family containing two lysine-specific demethylase (LSD) enzymes, and the KDM2–7 families consisting of the JmjC domain-containing enzymes (Table 1). Each KDM demethylates only certain methylated histone residues and sometimes only certain methylation states (mono-, di-, or tri-) of those residues. The demethylases that primarily act on methylated H3K4 (KDM1 and KDM5) belong to the two different enzyme families; those that primarily act on other methylated histone lysines belong only to the second, JmjC family.

Table 1 Lysine demethylase families and their histone substrates

The substrate-binding specificities of KDMs are quite diverse. The most prevalent histone lysine substrates are H3K4, H3K9, H3K27, H3K36, H4K20, and H1.4K26. In part, substrate specificity of each demethylase depends on the histone peptide sequence surrounding target lysine residues. However, another important component of KDM specificity is mediated by combinations of additional conserved “helper” domains within KDMs which can include combinations of such “helpers” as the plant homeodomain (PHD) [33], Tudor [34, 35], zinc fingers (such as zf-CXXC and zf-C2HC4) [36,37,38], F-box [39], AT-rich interactive domain (ARID) [40], tetratricopeptide repeat (TPR) [41] and leucine-rich region (LRR) domains [42, 43]. In addition, selectivity can be conferred by the composition and character of “linker” sequences between the catalytic and neighboring helper domain [44]. Furthermore, KDMs are part of large multimeric and dynamic complexes that contribute yet another level of specificity for gene localization and histone targeting. Finally, alternative splicing of mRNA of JmjC demethylases can lead to different isoforms. These isoforms could have different specificities and/or form different protein complexes.

Each of these topics must be discussed to understand our limited knowledge of the molecular basis of histone demethylation. Aberrant histone methylation caused by mutation or misregulation of histone demethylases and histone methyltransferases has been observed in several human diseases, particularly cancer. Modulation of histone methylation status for aberrant gene expression in cancers offers medicinal potential for the treatment of cancers via the development of molecular inhibitors of histone demethylases.

2 KDM1 Family Architecture and Mechanism

Unlike other KDMs, the KDM1 family utilizes the cofactor flavin adenine dinucleotide (FAD) to demethylate the methylated lysine substrate via a redox reaction (Fig. 1). The catalytic domain of the KDM1 family, the amine oxidase-like (AOD) domain, is related to the large superfamily of flavin-dependent monoamine oxidases, with MAO-A and MAO-B being the closest homologues [12]. Like other members of this superfamily, the AOD of the two KDM1 family members can be further subdivided into two separate subdomains, with one subdomain involved in substrate binding and the other forming an expanded Rossmann fold used to bind the cofactor FAD. Each of these subdomains is formed from sequence components spread throughout the primary structure. The substrate-binding subdomain is composed of a six-stranded mixed β-sheet flanked by six α-helices. The two subdomains create a big cavity that defines the demethylase catalytic center at their interface [20, 45,46,47].

Fig. 1 Reaction mechanisms of (a) KDM1/LSD1 demethylase family and (b) JmjC demethylases

The FAD cofactor bound to KDM1A undergoes a two-electron reduction coupled to substrate oxidation (Fig. 1a). Hydride transfer from the N-methyl group of the methyl-lysine onto FAD yields FADH2 and a hydrolytically labile imine. Molecular oxygen serves as the electron acceptor that restores the oxidized form of FAD, generating hydrogen peroxide. The unstable imine intermediate is then hydrolyzed non-enzymatically to release the demethylated lysine and formaldehyde. Thus, during catalysis, each cycle of methyl removal consumes one molecule of O2 and produces one molecule each of formaldehyde and H2O2. KDM1/LSDs are incapable of demethylating trimethyl-lysine residues, because the quaternary ammonium group cannot form the requisite imine intermediate.
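The per-cycle accounting described above can be sketched as a small bookkeeping function. This is an illustration only: the function name `kdm1_demethylate` and the dictionary keys are hypothetical labels chosen here, and the model captures nothing beyond the stoichiometry stated in the text (one O2 consumed, and one formaldehyde and one H2O2 released, per methyl group removed, with trimethyl-lysine rejected as a substrate).

```python
def kdm1_demethylate(methyl_groups: int, cycles: int = 1) -> dict:
    """Tally reactants and products for FAD-dependent (KDM1-type) demethylation.

    methyl_groups: methylation state of the lysine (0 to 3).
    cycles: number of catalytic cycles attempted.
    All names are illustrative; this is stoichiometric bookkeeping, not kinetics.
    """
    if methyl_groups not in (0, 1, 2, 3):
        raise ValueError("a lysine carries between 0 and 3 methyl groups")
    if methyl_groups == 3:
        # The quaternary ammonium group cannot form the imine intermediate.
        raise ValueError("trimethyl-lysine is not a KDM1 substrate")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")

    # Each productive cycle removes exactly one methyl group.
    removed = min(cycles, methyl_groups)
    return {
        "methyl_groups": methyl_groups - removed,
        "O2_consumed": removed,
        "formaldehyde": removed,
        "H2O2": removed,
    }


if __name__ == "__main__":
    # Two cycles take a dimethylated lysine to the unmethylated state.
    print(kdm1_demethylate(2, cycles=2))
```

Calling the function with `methyl_groups=3` raises an error rather than returning a result, which mirrors the substrate restriction rather than any measured rate.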

3 JmjC KDM Architecture and Reaction Mechanism

The JmjC KDM family belongs to a larger superfamily of oxygenases which utilize 2-oxoglutarate (2OG) (also referred to as α-ketoglutarate or α-KG) as a co-substrate and Fe(II) as a cofactor, and couple substrate oxidation to the decarboxylation of 2-oxoglutarate to produce succinate and CO2 (Fig. 1b). These enzymes are conserved in eukaryotes from yeast to humans [15], and have a double-stranded β-helical (DSBH) or “jelly-roll” fold consisting of eight antiparallel β-strands that form a β-sandwich structure composed of two four-stranded antiparallel β-sheets. This structure is often referred to as the Jmj or JmjC domain.

The distorted/squashed barrel-like structure is open at one end, where the octahedrally coordinated catalytic iron resides in a funnel-shaped active site with three interactions provided by a conserved His-X-Asp/Glu-Xn-His triad from the protein. The co-substrate 2OG sits in a compact binding site where its 2-oxo carboxylate group binds the iron in a bidentate manner; the iron ion also has a water molecule at the sixth position, where a catalytic oxygen species is expected to reside during the demethylation reaction. The other end of 2OG interacts with the side chain of a basic residue (Arg/Lys) and with a hydroxyl group from a Ser/Thr or Tyr residue [20, 48].

The JmjC KDM reaction begins with the generation of a superoxide radical by the Fe(II)/2OG complex, which reacts with the C2 atom of 2OG, leading to its decarboxylation to succinate and formation of an Fe(IV)-oxo species. The highly reactive Fe(IV)-oxo species then abstracts a hydrogen from a lysine ζ-methyl group as the iron is reduced to Fe(III), forming an unstable carbinolamine that rapidly breaks down, leading to the release of formaldehyde and loss of a methyl group from the methylated lysine residue. Unlike the KDM1 family, JmjC KDMs do not require lone pair electrons on the target nitrogen atom and thereby can demethylate trimethylated as well as di- and monomethylated lysines.
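The net reaction implied by this mechanism can be summarized in one equation. The notation is shorthand introduced here for illustration: Lys-me_n denotes a lysine carrying n methyl groups, protonation states are omitted, and consumption of one O2 per turnover is taken from the general behavior of 2OG-dependent oxygenases.

```latex
\[
\text{Lys-me}_n + \text{2OG} + \mathrm{O_2}
\;\longrightarrow\;
\text{Lys-me}_{n-1} + \text{succinate} + \mathrm{CO_2} + \mathrm{CH_2O},
\qquad n \in \{1, 2, 3\}
\]
```

The case n = 3 is included because the hydrogen abstraction step does not depend on a lone pair on the nitrogen, in contrast to the flavin-dependent KDM1 mechanism.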

JmjC demethylases bind methyl-lysine in a highly conserved pocket in the active site through the formation of a network of C–H···O hydrogen bonds between the methyl groups and the oxygen atoms from the backbone and side chain of active-site residues. Many crystallographic studies have revealed that substrate binding often involves residues from the first and second β-strand of the DSBH, together with strands and loops that extend the JmjC domain. In contrast, “reader” domains such as PHD bind methyl-lysine through cation–π interactions between the methylammonium ion and a cage formed by multiple aromatic residues [49,50,51].

An additional N-terminal interaction element has been identified in many JmjC KDMs that resides at varying distances away from the JmjC domain and is referred to as the JmjN domain [20]. In KDM2A and the KDM4 and KDM5 families, it interacts extensively with the catalytic JmjC domain and provides structural integrity without participating in active site formation [17, 52,53,54]. However, modeling and other recent analyses suggest that the JmjN-like fold exists in all KDM families [54, 55]. Thus, the JmjN domain is not a “domain” per se, but an integral part of the catalytic core. The JmjC domain in several KDM families should therefore be regarded as having acquired additional insertions during evolution; in one case, the KDM5 family, the insertion includes other domains.

In the following sections, each KDM family is discussed individually. First, a general description of the family and its members is given which includes our present knowledge of their relationship to cancers. A table containing all known NMR and crystal structures containing a family’s one or more domains, with cofactors and sometimes peptide substrates, is supplied at the end of each section. It is important to remember that a demethylase’s catalytic and other domains do not act in isolation but interact with each other and many other proteins. In fact, there are three main components that confer specificity to a demethylase: the active site of its catalytic domain, the other domains that the demethylase contains, and the other proteins with which the demethylase participates in multicomponent complexes. Therefore, secondly, each of these attributes for each family will be discussed. At the end of the chapter, there will be a short review that highlights the discovery of KDM inhibitors for which the information acquired could aid in drug development.

4 The KDM1 Family

KDM1A was shown to be a histone demethylase by the Shi group in 2004 [12]. At the same time, a second human flavin-dependent histone demethylase, KDM1B, was identified through a domain homology search of genomic databases [12]. In 2009, the Mattevi laboratory isolated and confirmed the flavin-dependent demethylation activity of KDM1B, noting specificity for H3K4me1/2, despite the relatively low sequence identity to KDM1A of less than 25% [56].

The KDM1 family primarily demethylates H3K4me2 and H3K4me1. H3K4 methylation is a gene activation marker [57]. These epigenetic marks are located in open chromatin, primarily in transcription factor binding regions, including promoters and enhancers that positively regulate expression of genes and can be located many thousands of base pairs down- or upstream of a gene [58,59,60]. These genomic loci are commonly devoid of H3K4me3 [61]. KDM1A appears to regulate histone methylation at promoters, while KDM1B is found in transcriptional elongation complexes and removes H3K4 methyl markings in gene bodies, thereby facilitating gene expression by reducing spurious transcriptional initiation outside of promoters [62]. KDM1B is highly expressed in oocytes and is required for de novo DNA methylation of some imprinted genes, a function dependent on its H3K4 demethylase activity [63].

While KDM1A and KDM1B each share a SWIRM domain and a C-terminal catalytic AOD domain, these demethylases have distinct functions and domains that mediate interactions with other biomolecules (Fig. 2) [64]. The N-terminal regions of the KDM1 proteins have no predicted conserved structural elements, but do contain nuclear localization signals for nuclear import [63, 65, 66].

Fig. 2 Representative crystal and domain structures of (a) KDM1A and (b) KDM1B

There are isoforms of KDM1A resulting from alternative splicing events [67], some of which acquire a new substrate specificity. One of these isoforms, LSD1n, targets H4K20 methylation, both in vitro and in vivo, and is involved in neuronal activity–regulated transcription that is necessary for long-term memory formation [68]. Another isoform, LSD1+8a, functions as a co-activator by demethylating the repressive H3K9me2 mark. LSD1+8a interacts with supervillin; the LSD1+8a/supervillin-containing complex demethylates H3K9me2 and thereby regulates neuronal differentiation [69].

KDM1A is overexpressed and/or correlated to poor outcomes such as shorter survival, relapse, high tumor grade, and metastasis in several cancers including prostate cancer [70, 71], bladder cancer [72], neuroblastoma [73], breast cancer [74, 75], non-small cell lung cancer [76], hepatocellular cancer [77], oral cancer [78], colon cancer [79], and sarcomas [80, 81]. Knockdown or inhibition of KDM1A decreased cell growth [71,72,73, 77,78,79, 82,83,84,85,86] and migration/invasion [76, 78, 87, 88], as well as increased differentiation [73, 82, 83] in multiple cancer types, both solid and non-solid. The oncogenic activities of KDM1A have been studied extensively in hematological malignancies, and KDM1A was found to be a major contributor to stemness [82, 83]. KDM1A inhibitors are currently in clinical trials for the treatment of particular leukemia subtypes [89]. In contrast to other studies, one report indicates that KDM1A restrains invasion and metastasis in triple-negative breast cancer [90].

Any roles KDM1B may be playing in cancer are just beginning to be elucidated [91, 92]. KDM1B directly ubiquitylates and promotes proteasome-dependent degradation of O-GlcNAc transferase (OGT), and inhibits A549 lung cancer cell growth in a manner dependent on this E3 ligase activity, but not its demethylase activity [92].

4.1 KDM1 Active Site

It has been shown that KDM1A requires a sufficiently long peptide consisting of the first 20 N-terminal amino acids of the H3 histone tail for productive binding [93]. In contrast to many other H3K4 binding proteins where the peptide has an extended conformation, several crystal structures of KDM1A with H3 peptide show that the peptide is severely compressed and has a serpentine shape in a deep active site cavity of KDM1A [94]. The peptide binds in a funnel-shaped pocket, adopting a folded conformation in which three structural elements were identified: a helical turn (residues 1–5) located in front of the flavin molecule, a sharp bend (residues 6–9), and a more extended stretch (residues 10–16) that remains partially solvent-exposed on the rim of the binding pocket. The H3 polar residues Arg2, Thr6, Arg8, Lys9 and Thr11, in addition to the N-terminal amino group, lie in well-defined pockets, forming specific and extensive electrostatic and hydrogen-bonding interactions with the surrounding KDM1A residues. The Arg2 residue of the histone H3 tail is essential for stabilization of this tail conformation in the binding site of KDM1A, due to the formation of intrapeptide hydrogen bonds with the side chain of Ser10 and the main chain of Gly12 and Gly13. The need to preserve these precise interactions explains the negative effect that nearly all epigenetic modifications (away from H3K4) have on KDM1A–H3 binding [93,94,95].
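The residue ranges above can be mapped onto the peptide sequence with a short sketch. The 20-residue sequence is an assumption supplied here (the canonical N-terminus of human histone H3 after removal of the initiator methionine); it is not quoted in the text, and the names `H3_TAIL`, `SEGMENTS`, `segment_sequence` and `label` are hypothetical.

```python
# First 20 residues of the histone H3 tail in one-letter code.
# Assumption: canonical human H3 N-terminus; not quoted in the chapter text.
H3_TAIL = "ARTKQTARKSTGGKAPRKQL"

# Structural elements of the KDM1A-bound peptide (1-based, inclusive ranges),
# taken from the description in the text.
SEGMENTS = {
    "helical turn": (1, 5),
    "sharp bend": (6, 9),
    "extended stretch": (10, 16),
}

# Three-letter codes for the residue types that occur in H3_TAIL.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "T": "Thr", "K": "Lys", "Q": "Gln",
    "S": "Ser", "G": "Gly", "P": "Pro", "L": "Leu",
}


def segment_sequence(name: str) -> str:
    """Return the one-letter sequence covered by a named structural element."""
    start, end = SEGMENTS[name]
    return H3_TAIL[start - 1:end]


def label(position: int) -> str:
    """Return a residue label such as 'Arg2' for a 1-based position."""
    return f"{THREE_LETTER[H3_TAIL[position - 1]]}{position}"


if __name__ == "__main__":
    for name in SEGMENTS:
        print(f"{name}: {segment_sequence(name)}")
    # Residues named in the text as making specific contacts.
    print([label(p) for p in (2, 6, 8, 9, 10, 11, 12, 13)])
```

Under this assumed sequence, the labels generated for positions 2, 6, 8, 9, 10, 11, 12 and 13 agree with the residue names used in the paragraph, which serves as a simple consistency check on the numbering.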

In a KDM1B-H3K4me1(1–26) crystal structure, the H3K4me1 peptide extends away from the catalytic cavity and interacts with KDM1B at a second binding site composed of two loops within the linker region of KDM1B [46]. Biochemical analyses indicate that this second binding site is important for substrate recognition and essential for the demethylase activity of KDM1B. KDM1A lacks this second binding site.

4.2 KDM1 Helper Domains

The SWIRM domain was first found in and named after the protein subunits Swi3, Rsc8 and Moira of SWI/SNF-family chromatin remodeling complexes. SWIRM has a compact fold composed of 6 α helices, in which a 20 amino acid long helix (α4) is surrounded by 5 other short helices. The SWIRM domain structure can be divided into the N-terminal part (α1–α3) and the C-terminal part (α4–α6), which are connected to each other by a salt bridge. SWIRM domains are highly conserved amongst chromatin associating proteins and have been shown to bind DNA [96, 97]. However, the residues that compose the typical DNA-binding interface are not conserved and are partially blocked by their interaction in KDM1 proteins with the AOD [45, 98, 99]. This ~85 amino acid domain is believed to help maintain the structural integrity of KDM1 AOD and acts as an anchor site for a histone tail.

In KDM1A, the AOD domain is interrupted with the inclusion of the Tower domain that is absent from KDM1B. KDM1A was originally identified as a component of transcriptional repressor complexes [100, 101] and many of these complexes are formed through this domain. Tower is an ~100 amino acid α-helical, antiparallel coiled-coil domain. This domain is infrequently found in eukaryotic proteins, but is in many prokaryotic proteins involved with intermolecular protein recruitment, membrane docking, and membrane translocation functions [102].

KDM1B lacks a Tower domain, but does contain a novel C4H2C2-type zinc finger (ZnF) and a CW-type zinc finger (Zf-CW) [46, 47]. The ZnF domain is required for KDM1B enzymatic activity through its interaction with the SWIRM domain [103, 104]. Mutations which disrupt the ZnF domains or relays of interactions among the ZnF-SWIRM-AOD may lead to subtle conformational alterations in the AOD that in turn impair the incorporation of FAD, and consequently its enzymatic activity [104]. The surface of the C4H2C2-type zinc finger shows a marked concentration of basic residues, and thus may impact demethylase substrate specificity or positioning within nucleosomal DNA. Additionally, these residues may facilitate interactions with coregulatory molecules or serve to recruit transcriptional machinery, such as phosphorylated RNA polymerase II [62, 104].

4.3 KDM1 in Multicomponent Complexes

The Tower domain of KDM1A forms a complex with the C-terminal domain of CoREST [105], the corepressor for the transcriptional repressor RE1-Silencing Transcription factor (REST) [106,107,108]. KDM1A is typically associated with CoREST, which it requires in order to demethylate nucleosomal substrates. This subcomplex is found in combination with histone deacetylase (HDAC) 1 or 2, forming a stable larger core complex recruited by many chromatin remodeling multiprotein complexes [100, 101, 109,110,111]. The CoREST corepressor is necessary for maintaining a repressive chromatin environment in important physiological contexts like neural cell differentiation [105, 112,113,114].

While structural studies of KDM1A co-crystallized with an H3 substrate peptide revealed that the enzyme active site cannot productively accommodate more than three residues on the N-terminal side of the methylated K4, there is evidence in vivo that KDM1A demethylates other substrates. For instance, when associated with the androgen receptor (AR), KDM1A appears to remove repressive methyl groups from H3K9, thereby enhancing AR-dependent gene transcription and resulting in prostate tumor cell proliferation [71]. KDM1A may do this by binding factors that dictate its substrate specificity. Protein kinase C beta 1 (PKCB1)-mediated phosphorylation of H3 threonine-6 also has been proposed as a mechanism to block the H3K4 site and shift the specificity of KDM1A to H3K9 [115]. Recently, EHMT2 (euchromatic histone-lysine N-methyltransferase 2 or G9a) was found to methylate KDM1A at Lys114. KDM1A K114me2, but not unmethylated KDM1A peptide, specifically interacts with the CHD1 (chromodomain-helicase-DNA-binding protein 1) double chromodomain, thus indicating that CHD1 is a reader of KDM1A K114me2. Methylation of KDM1A at Lys114 by EHMT2 and recruitment of CHD1 to AR-binding regions are key events controlling chromatin binding of AR. Thus, the dimethylation of KDM1A at Lys114 appears to ultimately control AR-dependent gene expression [116].

KDM1A appears to be recruited by many CoREST-like and other proteins to form complexes that perform coregulatory or scaffolding functions [64]. In one instance, KDM1A is an integral component of the Mi-2/nucleosome remodeling and deacetylase (NuRD) complex, adding histone demethylation activity to this complex [90]. The NuRD complex contains two other catalytic subunits, the deacetylase HDAC1 and the CHD4 ATPase, both of which are essential for the regulation of gene expression and chromatin remodeling [117]. KDM1A/NuRD complexes regulate several cellular signaling pathways, including the TGFβ1 signaling pathway, that are critically involved in cell proliferation, survival, and epithelial-to-mesenchymal transition [90].

Temporal expression patterns of specific components of KDM1A complexes modulate gene regulatory programs in mammalian development [63, 67, 118, 119]. Any interruption of these patterns, their transcriptional control and/or mutation of components can lead to cancer [64, 120] and other disease [121]. KDM1A is overexpressed in a variety of tumors, and its inactivation or downregulation can inhibit cancer progression [122, 123]. KDM1A targeting inhibitors are an avenue for anticancer drug discovery and will be discussed in a later section of this chapter.

Compared to KDM1A, there is much less known about the multicomponent complexes of KDM1B. In highly transcribed, H3K36me3-enriched coding regions downstream of gene promoters, KDM1B aids in the maintenance of H3K4 and H3K9 methylation by associating with a larger complex that includes Pol II and other elongation factors, as well as the SET-family histone methyltransferases NSD3 and G9a, which methylate histone H3K36 and H3K9 sites, respectively [62]. In addition, the H3K36me3 reader NPAC/GLYR1 is likely part of this complex as well, which augments the KDM1B demethylation of H3K4me1/2 by binding at its AOD/SWIRM interface [47].

A recent kinetic study showed there is a tight-binding interaction between full-length unmodified histone H3 and KDM1A, which suggests the existence of a secondary binding site on the demethylase surface available for complex formation. The contact between H3 and KDM1A likely occurs through an extensive interaction interface that contributes significantly to its recognition of substrates and products [124]. Apparently, there is still much to discover about KDM1 demethylase function and its control (Table 2).

Table 2 Structures containing domains of KDM1 demethylases

5 The KDM2 Family

The KDM2 family (Fig. 3) contains the first JmjC KDMs to be discovered and to be established as conserved in eukaryotes from yeast to humans [15]. KDM2 specifically demethylates H3K36me2, and to a lesser extent H3K36me1, which are histone modifications that are associated with transcriptional repression. There have been some indications that KDM2B is also a H3K4me3 demethylase [121], but this observation has only been made in vivo and not in vitro [135]. In some instances, one could imagine that KDM2B is in a complex with a KDM5 family member, as suggested by one study [136].

Fig. 3 Representative crystal and domain structures of KDM2 family

KDM2A is over-expressed and correlated to poor prognosis in breast [137, 138], non-small cell lung [139], and gastric cancer [140]. Several studies show that knockdown of KDM2A decreases cell growth [138,139,140], angiogenesis [138], invasion/migration [139, 140] and metastasis [140]. KDM2A promotes tumorigenicity through upregulation of target genes such as JAG1 in breast cancer [138] and DUSP3 in lung cancer [139]. In contrast, one study found that KDM2A knockdown had an opposite effect in breast cancer, increasing invasion, migration, and angiogenesis [141].

KDM2B is implicated in the pathology of breast cancer [142], pancreatic cancer [136], myelodysplastic syndromes [143], and acute myeloid leukemia [144]. Knockdown of KDM2B reduced cancer cell growth [136, 142, 144], as well as impaired stem cell self-renewal [142, 145] and transformation [135, 145, 146]. It is linked to senescence [135, 142, 147] and metabolism [146, 148] control. KDM2B has been shown to regulate cell cycle and senescence associated genes such as p15 and p16 [135, 144, 147].

5.1 KDM2 Active Site

Crystal structures of KDM2A with H3K36me2/1 peptides [53] (Table 3) reveal a narrow binding channel that closely accommodates Gly33 and Gly34 of H3, adjacent to the H3K36 mark. Any larger side chain at these positions would result in steric hindrance. A pocket binds Pro38 and stabilizes a sharp turn in the H3 backbone. The side chain of Tyr41 binds in a pocket on the demethylase surface through van der Waals interactions and hydrogen bonding. Residues Gly33, Gly34, Pro38, and Tyr41 are found only near H3K36; no comparable set of residues flanks any other lysine methylation site on histone H3 or H4.
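The uniqueness of this flanking pattern can be checked against the H3 tail with a short sketch. Several things here are assumptions made for illustration: the 45-residue sequence (the canonical N-terminal region of human histone H3.1, not quoted in the text), the reduction of the recognition elements to fixed offsets from the target lysine, and the names `H3_1_45`, `MOTIF`, `has_motif` and `lysines_with_motif`. The scan covers every lysine in H3 residues 1–45 rather than only documented methylation sites, and it does not examine H4.

```python
# Residues 1-45 of histone H3 in one-letter code.
# Assumption: canonical human H3.1 N-terminal region; not quoted in the text.
H3_1_45 = "ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHRYRPGT"

# Recognition elements expressed as offsets from the target lysine:
# Gly33 and Gly34 sit at -3 and -2, Pro38 at +2 and Tyr41 at +5 relative to K36.
MOTIF = {-3: "G", -2: "G", 2: "P", 5: "Y"}


def has_motif(sequence: str, position: int) -> bool:
    """Return True if the residue at a 1-based position has the full flanking motif."""
    for offset, amino_acid in MOTIF.items():
        index = position + offset - 1  # convert to a 0-based index
        if index < 0 or index >= len(sequence):
            return False
        if sequence[index] != amino_acid:
            return False
    return True


def lysines_with_motif(sequence: str) -> list:
    """Return the 1-based positions of all lysines carrying the flanking motif."""
    return [
        i + 1
        for i, amino_acid in enumerate(sequence)
        if amino_acid == "K" and has_motif(sequence, i + 1)
    ]


if __name__ == "__main__":
    print(lysines_with_motif(H3_1_45))
```

Note that Lys14 is also preceded by two glycines, but at offsets -2 and -1 rather than -3 and -2, and it lacks the proline and tyrosine positions, so it does not satisfy the pattern as encoded here.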

Table 3 Structures containing domains of KDM2 demethylases

Surprisingly, KDM2A bound with H3K36me3 peptide, the inactive substrate for KDM2A, could be crystallized. Comparison of structures with H3K36me3 peptide with those with substrate H3K36me2/1 peptide and/or different cofactors suggests that a third methyl group on H3K36me2 may sterically hinder an axial-to-in-plane conversion of the 2OG positioning required for catalysis [149].

5.2 KDM2 Helper Domains and Multicomponent Complexes

Both KDM2 homologs contain a zinc finger CXXC domain that specifically recognizes non-methylated CpG dinucleotides [150], seemingly targeting these histone demethylases to the genomic regions known as CpG islands (CGIs; these contain a high density of CpG dinucleotides where the cytosine nucleotide is primarily not methylated) that are associated with ~70% of mammalian gene promoters and gene regulatory units [151,152,153,154]. When KDM2B recognizes non-methylated DNA in CGIs, it recruits the polycomb repressive complex 1 (PRC1) that then contributes to histone H2A Lys119 ubiquitylation (H2AK119ub1) and gene repression [155, 156]. KDM2B associates with a noncanonical PRC1 to regulate adipogenesis [157]. Furthermore, KDM2B, via its F-box domain, functions as a subunit of the CUL1-RING ubiquitin ligase (CRL1/SCFKDM2B) complex, where SCF stands for the combination of Skp, Cullin, and F-box proteins. KDM2B targets c-Fos for polyubiquitylation and regulates c-Fos protein levels [158]. Another paper suggests KDM2B has unexpected E3 ubiquitin ligase activity. The F-box in KDM2B shows E3 ligase activity in vitro, but has not been characterized further in vivo [159].

The F-box domains encoded by KDM2A and KDM2B have 78% protein sequence identity. This suggests that KDM2A may also recognize CpG through its CXXC domain and is likely to form a functional SCF E3 ligase [137]. Interestingly, KDM2A and KDM5B had the highest frequency of genetic amplification and overexpression in breast cancer among 24 KDM genes tested [137]. KDM2A had the highest correlation between copy number and mRNA expression, and high mRNA levels of KDM2A were significantly associated with shorter survival of breast cancer patients. KDM2A has two isoforms: the long isoform that is the whole protein and a short form that lacks the N-terminal JmjC domain but contains all other motifs, including the CXXC and F-box domains. It is this short form of KDM2A that has oncogenic potential and functions as an oncogenic isoform in a subset of breast cancers [137].

6 The KDM3 Family

The KDM3 family (Fig. 4) contains three members in humans, but only two, KDM3A and KDM3B, are fully verified demethylases, which act upon H3K9me2/1. The domains of the KDM3 family encompass a C2HC4 zinc finger followed by a ~225-residue-long JmjC domain, which shows 86% similarity between KDM3A and KDM3B. Between these domains lies an LXXLL motif known to be involved in nuclear hormone-receptor interactions [160, 161].

Fig. 4 Representative crystal and domain structures of KDM3 family

KDM3A is a crucial regulator of spermatogenesis, embryonic stem cell self-renewal, and metabolic gene expression. Both KDM3A and KDM3B may have roles in sex determination [162,163,164,165,166,167,168]. KDM3A expression is upregulated in lung cancer [169, 170], gastric cancer [171], neuroblastoma [172], Ewing sarcoma [173], bladder cancer [170], renal cell carcinoma [174], and hepatocellular carcinoma [175]. Additionally, it is implicated in the tumorigenesis of multiple myeloma [176], prostate cancer [177], and colon carcinoma [178, 179]. Knockdown or inhibition of KDM3A inhibited growth [169,170,171, 173, 174, 176,177,178,179,180,181] as well as migration or invasion [169, 172, 174, 179] in several tumor cell types. KDM3A was shown to control expression of several well-known proto-oncogenes including c-Myc [177], HOXA1 and CCND1 [170]. It has been shown to be regulated by hypoxia in cancers [175, 178] and to play a role in angiogenesis [180], further supported by a report that expression of KDM3A is higher in hypoxic environments and near blood vessels in renal cell carcinoma [174]. Contradictorily, one study reports that KDM3A acts as a tumor suppressor in human germ cell-derived tumors like embryonal carcinomas, seminomas, and yolk sac tumors [182].

The biological functions of KDM3B are not as well characterized. The human gene for KDM3B is located at 5q31, a chromosomal area that is often deleted in malignant myeloid disorders, including acute myeloid leukemia and myelodysplasia [183]. The enforced expression of this demethylase in a cell line carrying a 5q deletion inhibits clonogenic growth, indicating that loss of KDM3B may be involved in the pathogenesis of these cancers and that KDM3B may have tumor suppressor activities. Further supporting a tumor suppressor role, high KDM3B expression correlated with better disease-free survival after mastectomy in breast cancer patients [184]. However, contrary to the above studies, KDM3B is overexpressed in acute lymphoblastic leukemia and displays specific activity in vitro and in vivo in leukemogenesis. In this setting, it acts as a transcriptional coactivator to repress differentiation [185]. KDM3B is also amplified in non-small cell lung cancer [186].

A third member of the family, KDM3C, shows no verifiable demethylase activity on H3K9 peptides in vitro but appears to be active in cells [187,188,189]. KDM3C inhibits the neuronal differentiation of human embryonic stem cells and has been found mutated in intracranial germline tumors [188, 190, 191]. KDM3C is reported to play a role in the maintenance of leukemias by functioning as a coactivator for key transcription factors; its knockdown resulted in apoptosis and impaired growth of cancer cells [189, 190].

6.1 KDM3 Activity and Helper Domains

The C2HC4 zinc finger domain is required for enzymatic activity of KDM3A [192] and the demethylase appears to dimerize through interactions between this domain and the JmjC domain [193]. In addition, if one active site in the KDM3A dimer is mutated, the enzymatic activity of two-step demethylation is significantly decreased. For this KDM family, it appears that the initial conversion of H3K9me2 into H3K9me1 occurs at one active site of the dimer. After the first demethylation step is finished, allosteric regulation of substrate channeling occurs, the monomethylated substrate binds, and conversion of H3K9me1 into H3K9me0 takes place at the second site [193]. Another observation is that one residue, Thr667, contributes to the H3K9me1/2 substrate specificity of wild-type KDM3A: a T667A mutation alters specificity towards H3K9me2 [187]. Thr667 may aid in aligning the methyl group of monomethylated H3K9 correctly in the active site center, presumably bringing it in close proximity to the iron so that the reaction can be catalyzed.

While no papers discussing KDM3 crystal structures have been published, one crystal structure of the catalytic region of KDM3B (residues 1380–1720) has been deposited in the PDB (PDB: 4C8D), illustrating an unusual JmjC architecture (Table 4). An N-terminal motif preceding the JmjC domain comprises several α-helices and two three-stranded anti-parallel β-sheets that form β-extension motifs that buttress each side of the central JmjC β-barrel. One of the three-stranded β-sheets is located near the entrance of the active site, implicating it in recognizing the H3 peptide substrate.

Table 4 Structures containing domains of KDM3 demethylases

6.2 KDM3 in Multicomponent Complexes

Several studies have indicated that KDM3A has a role in regulating hypoxia-inducible genes through interaction with transcription factors that are targeted to KDM3A under hypoxic conditions [178, 194,195,196,197]. Hypoxic conditions have been linked to enhanced tumor growth [194]. Hypoxia is commonly found in solid tumors where the access of anticancer drugs is restricted, and the hypoxia allows for a selective environment for aggressive cancer cells [196, 198]. KDM3A has been shown to maintain some demethylase activity even under severe hypoxic conditions [199]. KDM3A exhibits hormone-dependent recruitment to androgen-receptor target genes through interaction with the androgen receptor (AR) to upregulate AR target gene expression [192].

7 The KDM4 Family

This demethylase family (Fig. 5) is probably the most examined of all the JmjC demethylase families, especially KDM4A; presently, there are over 55 crystal structures of this enzyme deposited in the PDB, including complexes with >20 inhibitors (Table 5, discussed below). There are many excellent published reviews [52, 200,201,202]. The KDM4 family has specificity for two regions of H3 with different sequences. Members act on H3K9me3/me2 and, in some cases, H3K36me3/me2. H3K9me3 demethylation promotes an open chromatin state, contributing to the transcriptional activation of promoter regions [200]. KDM4A and KDM4B occupancy is fairly evenly distributed across different genomic regions, while KDM4C localizes predominantly to H3K4me3-containing promoter regions [203,204,205].

Fig. 5 Representative crystal and domain structures of KDM4 family

Table 5 Structures containing domains of KDM4 demethylases

In humans, this family contains five known members. The KDM4A-C proteins share more than 50% sequence identity; each contains JmjN and JmjC domains, two plant homeodomains (PHD), and two hybrid Tudor domains that form a bilobal structure, with each lobe resembling a normal Tudor domain. KDM4D and KDM4E, in contrast, are considerably shorter proteins that lack the C-terminal region, including the PHD and Tudor domains [17, 52]. Biochemical studies indicate that KDM4A-C catalyze the removal of H3K9 and H3K36 di- and trimethyl marks. However, in vivo, KDM4A seems to demethylate only trimethylated residues [19] and has a greater affinity for H3K9me3 than for H3K36me3 [16, 20, 206]. KDM4D can only demethylate H3K9me3/me2 [17, 52]. KDM4E, meanwhile, catalyzes the removal of two methyl groups from H3K9me3 and also from H3K56me3 [207, 208].

The KDM4 family members are associated with cancer in several ways, summarized for most family members below (reviewed in [52, 200, 201, 209]). Several members are involved in hypoxia [210, 211] and DNA mismatch repair [212]. KDM4A, B, and C are required for the survival of acute myeloid leukemia cells [213].

KDM4A expression is upregulated and/or correlates with poor outcomes in many cancers including breast [214, 215], prostate [17, 216, 217], lung [218,219,220], bladder [220], gastric [221], and endometrial carcinoma [222, 223]. KDM4A overexpression led to the development of prostatic intraepithelial neoplasia, and combined overexpression of KDM4A and ETV1 resulted in prostate carcinoma formation in Pten +/− mice [217]. Furthermore, overexpression of KDM4A has been shown to cause localized copy gains and DNA re-replication in tumor cells [224]. Knockdown or knockout of KDM4A inhibits growth [217, 221,222,223, 225,226,227,228], migration/invasion [222, 223, 227], and metastasis [87] in several cancer models, and provokes apoptosis [218, 221, 226] and senescence [219]. KDM4A regulates target genes such as p27 [223], YAP1 (yes-associated protein 1) [217], ARHI (aplasia Ras homolog member I) [215], CHD5 (chromodomain helicase DNA-binding domain 5), and activating protein 1 (AP1) family genes [87]. It is associated with cancer-related proteins such as the androgen receptor (AR) [223], p53 [226], and SIRT2 (sirtuin 2) [228].

KDM4B is overexpressed and correlates with adverse outcomes in many cancers including endometrial cancer [223], luminal breast cancer [214], colorectal cancer [229], bladder [230], lung [230], prostate [17, 231], gastric [232,233,234], hepatocellular carcinoma [235], and osteosarcoma [236]. Knockdown of KDM4B inhibits growth [211, 225, 229, 230, 232, 237,238,239], migration/invasion [223, 229, 233], and metastasis [233], and induces DNA damage [239] and apoptosis [232, 239, 240]. It is known to associate with nuclear receptors such as AR [223, 231] and ERα (estrogen receptor α) [211, 237, 238] to drive cancers, regulating target genes such as c-Myc [223], CDK6 (cyclin-dependent kinase 6) [230], and ERα target genes such as CCND1 [211, 237].

KDM4C is amplified in basal-like breast cancer [214, 241], esophageal squamous cell carcinomas [242], sarcomatoid lung carcinoma [243], lymphomas [244], and medulloblastoma [245, 246]. Likewise, it is overexpressed and/or associated with negative patient outcomes in basal-like breast cancer [214], esophageal squamous cell carcinoma [242, 247], prostate cancer [17], and osteosarcoma [236]. KDM4C knockdown or inhibition prevents growth of several cancer types [17, 210, 244, 248], as well as breast cancer metastasis to the lung [210]. Overexpression of KDM4C was able to transform normal-like breast epithelial cells [241]. KDM4C was reported to interact with HIF1α (hypoxia inducible factor 1 α) [210] and FGF2 (fibroblast growth factor 2) [236], as well as to target the p53 pathway gene MDM2 (mouse double minute 2 homolog) [249]. In contrast, one study reports that KDM4C expression is associated with improved breast cancer survival and response to therapy [250]. KDM4D is overexpressed in basal-like breast cancer [214]. KDM4D knockdown blocked the proliferation of colon cancer cells, but surprisingly, KDM4D was shown to bind p53 and activate p21 expression [251].

7.1 KDM4 Active Site

Binding specificity in KDM4 members originates from amino acids surrounding lysines 9 and 36 on histone H3, whereas the space and electrostatic environment in the methyl group–binding pocket of these enzymes allow di- and trimethylated, but not monomethylated, lysine residues to position a methyl group productively toward the Fe2+ atom in the catalytic center. Crystal structures and modeling underscore the importance of certain residues in KDM4 demethylases for defining H3K36me3 recognition [225]. In KDM4A and KDM4B, residues Ile71, Asn86, and Asp135 engage in van der Waals interactions or hydrogen bonds with H3 residues H39 and R40 on the C-terminal side of H3K36me3, while the side chains of Leu75, His90, and Asp139 in KDM4D cannot avoid steric clashes with H39 and R40. The crystal structure of the KDM4D-peptide complex also shows that the R42 side chain of H3 lies in close proximity to Lys91 and Lys92 on the surface of KDM4D, resulting in potential electrostatic repulsion between the enzyme and the H3K36me3 peptide. In KDM4A, the corresponding residues, Ile87 and Gln88, possess uncharged side chains, alleviating this electrostatic repulsion. Ile71, Asn86, Ile87, and Gln88 in KDM4A are strictly conserved in KDM4B and KDM4C, and mutations of these residues to the corresponding amino acids in KDM4D and KDM4E (i.e., I71L, N86H, I87K, and Q88K) disrupt H3K36me3 demethylation in vitro and in vivo [252].
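The residue numbering used above for the histone H3 tail (K9, K36, H39, R40, R42) can be checked with a short lookup. The following is a minimal sketch, assuming the canonical human H3.1 N-terminal sequence numbered from the first residue after the initiator methionine; the names H3_TAIL, h3_residue, and flanking are hypothetical helpers introduced here for illustration and do not come from any cited work.

```python
# First 45 residues of human histone H3.1, numbered from the residue that
# follows the initiator methionine (assumption: canonical H3.1 numbering).
H3_TAIL = "ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHRYRPGT"


def h3_residue(position: int) -> str:
    """Return a label such as 'K36' for a 1-based position in the H3 tail."""
    if not 1 <= position <= len(H3_TAIL):
        raise ValueError(
            f"position {position} is outside residues 1-{len(H3_TAIL)}"
        )
    return f"{H3_TAIL[position - 1]}{position}"


def flanking(position: int, width: int = 6) -> str:
    """Return the residues within `width` positions of the given site."""
    start = max(0, position - 1 - width)
    end = min(len(H3_TAIL), position + width)
    return H3_TAIL[start:end]


if __name__ == "__main__":
    # Sites named in the discussion of KDM4 substrate recognition.
    for pos in (9, 36, 39, 40, 42):
        print(h3_residue(pos))
    # Sequence context around the two KDM4 target lysines.
    print(flanking(9))
    print(flanking(36))
```

Comparing the two windows makes the point of the paragraph concrete: the residues flanking K9 and K36 differ, so an enzyme that accepts both sites must tolerate two distinct sequence contexts, and H39, R40, and R42 all fall on the C-terminal side of K36.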

In KDM4A-C, the C–H•••O hydrogen-bonding network in the active site places one of the three methyl groups of the trimethylated lysine close to the Fe2+ and in an ideal position for catalysis. When dimethylated or monomethylated lysines bind at the active site, the methyl groups are sequestered away from the metal ion by C–H•••O hydrogen bonds. Therefore, the catalysis-competent methyl position is energetically disfavored. For a dimethylated lysine, a rotational movement could allow one of the methyl groups to gain access to Fe2+ for catalysis, probably with less efficiency than a trimethylated lysine [253, 254]. A monomethylated lysine would be completely sequestered and cannot reach the proper positioning for catalysis. Ser288 (whose hydroxyl group forms C–H•••O hydrogen bonds with methyl groups) is frequently substituted by Ala in other JmjC demethylases, such as KDM4D and KDM6A (A1238). The substitution of Ser288 in KDM4A with Ala enhances its activity, especially on dimethylated substrates, indicative of this residue’s role in the determination of the methylation state specificity [20, 254].

7.2 KDM4 Helper Domains

The functions of the two PHD domains in KDM4A-C remain unknown. However, it appears that differential tandem Tudor domain (TTD) binding properties across the KDM4 demethylase family may distinguish the targets of the KDM4 family in the genome. The TTD has two shared β-strands that interdigitate to form a bilobal structure, with each lobe resembling a normal Tudor domain. The KDM4A TTD recognizes both H3K4me3 and H4K20me3 [206, 255, 256], while the KDM4B TTD binds methylated H4K20 [257], and the KDM4C TTD is specialized to recognize only methylated H3K4 [203, 258]. In the crystal structure of the KDM4A TTD, the second Tudor domain uses a cluster of aromatic residues, Phe932, Trp967 and Tyr973, to establish an open cage pocket for binding the side chain of H3K4me3 or H4K20me3, while the side chains of the other Tudor domain form intermolecular contacts [206]. However, the H3 and H4 peptides contact the Tudor domains in opposite orientations and at different surfaces of the second hybrid Tudor domain [35, 259].

7.3 KDM4 in Multicomponent Complexes

Recall that H3K4 trimethylation is a hallmark of active promoters that are usually devoid of H3K9 trimethylation, a mark of inactive chromatin. Through its Tudor domain, KDM4A could be recruited to active gene promoters where it would demethylate H3K9 ensuring amplification of gene transcription. As one example of a KDM4 family member in an epigenetic modifying complex, KDM4B is physically associated with and an integral component of the H3K4 methyltransferase mixed-lineage leukemia 2 (MLL2) complex [238]. This complex could potentially be a Tudor domain-independent instance (possibly through PHD) in which KDM4B can simultaneously demethylate H3K9 while H3K4 becomes trimethylated. This KDM4B/MLL2 complex co-purifies with estrogen receptor α (ERα) and is required for ERα-regulated transcription [238]. ERα exhibits greater stability when KDM4B is overexpressed, and ERα can upregulate KDM4B. This creates a positive feedback loop between these two molecules to amplify the estrogen signal [260]. A similar mechanism has been proposed for AR signaling [231]. In this manner, KDM4B has an oncogenic role in both breast and prostate cancers. Another report finds KDM4B and KDM4C work distinctly and combinatorially in different multicomponent complexes in embryonic stem cells that affect their differentiation [204].

8 The KDM5 Family

The human KDM5 family (Fig. 6 and Table 6), specific for the demethylation of H3K4me3/2, encompasses four enzymes: KDM5A/JARID1A/RBP2 (retinoblastoma-binding protein 2), KDM5B/JARID1B/PLU-1, KDM5C/JARID1C/SMCX (selected mouse cDNA on the X), and KDM5D/JARID1D/SMCY (selected mouse cDNA on the Y) [278]. KDM5 members show a high degree of homology in sequence and domain organization [54, 279]. In addition to the catalytic JmjC domain, each contains a JmjN domain, an ARID DNA-binding motif, two or three PHD finger domains and a C5HC2-type zinc finger domain. The KDM5 family is unique among JmjC-containing histone demethylases in that there are identifiable domains, the ARID and PHD, between the JmjN and JmjC. Despite the fact that all members of KDM5 catalyze the demethylation of the same histone mark, they appear to have exclusive functional properties, probably because of their different expression profiles and presence in distinct protein complexes [278, 279].

Fig. 6 Representative crystal and domain structures of KDM5 family

Table 6 Structures containing domains of KDM5 demethylases

This family of KDMs is the only one to demethylate the H3K4me3 mark. In genome-wide studies, this mark broadly correlates with RNA polymerase II occupancy at sites of active gene expression, and is thought to provide an additional layer of transcriptional regulation. H3K4me3 is found at transcriptionally active genes, and it also occurs in combination with repressive histone marks such as H3K27me3 [280] at the promoters and transcriptional start sites at the 5′-end of important developmental genes [280, 281], keeping them in the “poised for activation” state.

Recently, crystal structures of truncated KDM5A, KDM5B, and KDM5C proteins have been determined [54, 274, 282, 283]. In truncated KDM5 proteins, the ARID and PHD1 domains between JmjN and JmjC are dispensable to activity, while the C5HC2 zinc finger motif is required for its in vivo [284] and in vitro activity [54, 274]. The active KDM5A and KDM5B structures showed that the domain arrangement of this KDM family most closely resembles that of KDM6, despite the fact that the catalytic domain shares the greatest sequence identity with the KDM4 family (33%). The fold of the catalytic JmjC domain is highly conserved with that of KDM6A (PDB ID 3AVS; r.m.s. deviation = 0.46 Å over 107 Cα) and other JmjC demethylases, despite the fact that this region retains only 16% sequence identity with KDM6A. There is a C-terminal helical domain composed of four helices, and a zinc finger C5HC2 motif was found, similar to the GATA-like motif in the KDM6 family (see below).

KDM5 family enzymes have been studied in several types of cancer and cancer processes (for reviews, please see [278, 279]). KDM5A and KDM5B are reported to be amplified or overexpressed in many cancers, and have been shown to play key roles in cancer cell proliferation, drug resistance and metastasis. KDM5A is amplified or over-expressed in several cancers including breast [285], lung [286, 287], hepatocellular [288], and gastric [289, 290] cancers. It is linked to proliferation and senescence control by antagonizing the functions of retinoblastoma protein (pRB) [291,292,293] and suppressing the expression of cyclin-dependent kinase inhibitor genes such as p21, p27, and p16 [286, 288, 289]. In three different genetically-engineered mouse tumor models, knockout of KDM5A significantly prolonged survival [293, 294]. KDM5A has also been shown to play a role in epithelial-mesenchymal transition [287, 295], invasion [286, 294], and metastasis [294, 296]. Additionally, expression of KDM5A is implicated in anti-cancer drug resistance in lung cancer [297], breast cancer [285], and glioblastoma [298].

KDM5B is reported to be overexpressed in breast [284, 299], lung [300], bladder [301], diffuse large B-cell lymphoma [302], prostate [303], colorectal [304], glioma [305], ovarian [306], and hepatocellular [307, 308] cancers. KDM5B has been shown to repress expression of tumor suppressor genes such as BRCA1 and HOXA5, as well as cell cycle checkpoint genes such as p15, p27, and p21 [284, 305, 307, 309]. Furthermore, KDM5B expression is linked to stem cell-like properties and resistance to a targeted therapy in melanoma [310, 311]. Many recent studies link expression of KDM5B to poor prognosis, chemoresistance, and metastasis in a variety of cancers [306, 308, 312,313,314,315].

Several studies indicate that KDM5 enzymes may have tumor suppressive functions in particular contexts. Breast cancer patients with high expression of KDM5A had a better response to docetaxel [184]. Migration and invasion are suppressed in triple negative breast cancer cells when KDM5B is artificially overexpressed [316]. Finally, KDM5C and KDM5D are inactivated or deleted in renal cell carcinoma [317] and prostate cancer [318], respectively. KDM5C knockdown significantly increased growth of renal cell carcinoma cells in a xenograft model [319].

8.1 KDM5 Active Site

Modeling places a trimethylated lysine residue in the active site of KDM5A, surrounded on four sides by Trp470, Tyr472, Asn585, and the metal-ligand water molecule. The aromatic indole ring of Trp470 would be in parallel with the hydrophobic portion of the target lysine. The side chains of Tyr472 and Asn585 would each coordinate one methyl group, whereas the third methyl group would be in close proximity to a metal ligand-coordinated water molecule. During the catalytic cycle, this site would be occupied by the dioxygen O2 molecule that initiates the demethylation reaction by abstracting a hydrogen atom from the substrate.

8.2 KDM5 Helper Domains

The ARID domain binds double-stranded DNA and may be involved in anchoring KDM5 proteins onto linear or nucleosome-wrapped nucleic acid [303, 320, 321]. In the KDM5A structure containing an ARID domain [283], the domain adopts the canonical fold but differs slightly in its loop conformations compared to the NMR structure of the isolated KDM5A ARID domain [320] and may block part of the substrate binding site, suggesting that ARID-PHD1 may interfere with substrate binding until the protein interacts with a nucleosome.

The PHD1 domains of both KDM5A and KDM5B have been shown to bind to unmodified H3K4 (H3K4me0) [316, 322,323,324], whereas both of their PHD3 domains have been shown to bind to H3K4me3 [316, 325]. Though KDM5B’s PHD3 domain favors binding to H3K4me3, it will also bind to lower methylation states of H3K4 [316, 325]. The PHD2 domain of KDM5B apparently does not recognize histone [316]. Of note, binding of the PHD1 domain of KDM5D to H3K4me0 is ~30-fold weaker than that of KDM5B, even though their sequences are very similar. This is likely because the Leu326 residue in the KDM5B sequence is replaced by a phenylalanine in KDM5D, where this bulky side chain may cause steric hindrance and obstruct the interaction with the peptide [316]. The binding of these PHD domains to both the substrate and the product of KDM5 demethylases may seem unusual. Yet, binding of PHD1 to H3K4me0 may provide an anchoring mechanism for KDM5A/B to sense H3K4me3 through PHD3 and slide along the H3K4me3-enriched promoters, demethylating other nearby methylated H3K4 and further spreading the transcriptionally inactive state of chromatin. Interestingly, such a model was proposed in one of the first papers to identify that a KDM5 family member is capable of erasing methyl groups of trimethylated H3K4 [326].

As mentioned above, the C5HC2 zinc finger motif is required for activity. In KDM6A, the interaction of a similar zinc finger domain with the KDM6 JmjC domain is required for activity, and in a KDM6A crystal structure with peptide, the zinc finger domain undergoes rearrangement and aids in recognition of a portion of histone H3 around the substrate H3K27 [327]. A future structure of a KDM5 demethylase with peptide may reveal something similar.

8.3 KDM5 in Multicomponent Complexes

KDM5A interacts with the Sin3B/HDAC complex, and KDM5A and Sin3B/HDAC cooperate in transcriptional repression of a subset of E2F4 target genes through deacetylation, demethylation, and nucleosome repositioning [328]. Similarly, KDM5B copurifies and colocalizes with components of the NuRD complex, indicating that KDM5B and NuRD may cooperate in transcriptional repression [316]. The NuRD complex contains two catalytic subunits, the deacetylase HDAC1 and the CHD4 ATPase, both of which are essential for the regulation of gene expression and chromatin remodeling.

9 The KDM6 Family

This KDM family contains three human demethylases (Fig. 7). KDM6A consists of 1401 amino acids and contains a JmjC catalytic domain and six tetratricopeptide repeat (TPR) protein-protein interaction domains [41, 330]. KDM6B consists of 1679 amino acids, but it does not appear to contain any characterized domains other than the JmjC domain [331]; however, sequence analysis suggests that it may also contain TPR domains similar to those of KDM6A. These two enzymes have 84% sequence similarity in the JmjC domain [330]. KDM6C consists of 1347 amino acids [332] and shares 83% amino acid identity with KDM6A throughout its sequence [333]. It was once thought to be enzymatically inactive [330, 334], but minimal demethylase activity was later demonstrated, and its low level appears to be due to a subtle sequence divergence in the JmjC catalytic domain [332]. KDM6C is encoded on the Y chromosome and partially compensates for some KDM6A functions, some of which may be demethylase-independent, as demonstrated using knockout mouse models [335]. KDM6C may activate transcription in a gene-specific manner, as there has been no observation of a decrease in global levels of H3K27me3 upon overexpression of KDM6C in HEK 293T cells. It is thought that this demethylase may be required in male sex determination during development [332].

Fig. 7 Representative crystal and domain structures of KDM6 family

KDM6 family members have both pro- and anti-oncogenic roles in cancer, depending on the cell type (reviewed in [336,337,338]). KDM6A is often classified as a tumor suppressor. Inactivating mutations have been reported in medulloblastoma [339], bladder cancer [340, 341], T-cell acute lymphoblastic leukemia (T-ALL) [342, 343], acute lymphoblastic leukemia [344], renal cell carcinoma [317], chronic myelomonocytic leukemia [345], and in many other solid and non-solid tumors [346]. Its expression was necessary and sufficient to arrest the cell cycle in human fibroblasts by targeting genes encoding Rb-binding proteins, and depleting KDM6A increased proliferation [347]. Re-expression of KDM6A in KDM6A-null esophageal carcinoma cell lines slowed proliferation [346], and knockdown of KDM6A enhanced in vitro and in vivo growth of bladder cancer cells [341].

KDM6A knockout in T-ALL cells increased T-ALL kinetics and decreased lifespan of recipient mice. Overexpression of KDM6A in T-ALL cell lines decreased growth and induced apoptosis [343]. Similarly, knockdown of KDM6A boosted development of T-ALL in mice and sensitized cells to treatment with the EZH2 inhibitor 3-DZNep [342]. In the TAL1-positive subgroup of T-ALL, however, KDM6A is oncogenic, and its knockdown attenuated cell growth and induced apoptosis, while overexpression increased cell growth [348].

There are divergent reports of KDM6A’s role in breast cancer as well. While one study reports that low KDM6A expression predicts poor survival in breast cancer [347], another reports that high KDM6A expression is associated with poor prognosis in breast cancer [349]. The latter is supported by a study that finds KDM6A is overexpressed in breast cancer and correlated to tumor grade [350]. Additionally, knockdown of KDM6A decreased breast cancer cell proliferation, invasion, and lung colonization [349].

KDM6B can act as a tumor suppressor through interactions with p53 [351, 352] and activation of p16 [351, 353, 354], promoting senescence after oncogene induction [353, 354] and differentiation of cancer stem cells [351]. In support, KDM6B expression is reduced in several cancer types [353, 355]. KDM6B knockdown decreased p15 expression, with a concurrent increase in proliferation and decrease in apoptosis in colorectal cancer cells, where low expression predicts poor patient prognosis [355].

On the other hand, KDM6B expression is increased in melanoma [356]. Depletion of KDM6B in melanoma suppressed self-renewal, trans-endothelial migration, metastasis, angiogenesis, and macrophage recruitment [356]. Similarly, knockdown of KDM6B reduced tumor growth and induced apoptosis in diffuse large B-cell lymphoma cells [357]. In T-ALL, KDM6B was critical for tumor initiation and maintenance through control of NOTCH1 target genes like HEY1, HES1, and NRARP [343]. Treatment with the pan-KDM6/5 inhibitor GSK-J4 [358] has anti-tumor effects on K27M H3.3 mutants in brainstem gliomas [359], ovarian cancer cells [360], T-ALL cells [343], and TAL1-positive T-ALL patient-derived xenografts [348]. KDM6C may play a role in prostate cancer tumorigenesis [361, 362], but its roles have yet to be fully elucidated.

9.1 KDM6 Helper Domains

A domain lies C-terminal to the JmjC domain of KDM6 demethylases that contains a four α-helix bundle which is bisected between the third and fourth helices by a Zn2+-coordinated GATA-like domain of novel topology [363] containing four conserved cysteine residues that coordinate a zinc ion to stabilize the structure. The JmjC and GATA-like zinc-binding domains in KDM6 proteins pack against each other with a large buried surface area (~4000 Å2). This zinc-binding domain is required for optimal stability and the catalytic competence of the truncated KDM6 proteins observed in crystal structures.

Of note, this zinc-binding domain is involved in recognizing an N-terminal portion (H3R17 to H3T22) of the histone H3 target site [327] (Table 7). The zinc-binding domain undergoes a significant conformational change upon binding to the N-terminal portion of histone H3, and this change exposes a hydrophobic patch composed of His1320, His1329, Leu1342, and Val1356 by displacing Tyr1354, which was masking this hydrophobic patch. Among the residues in the N-terminal portion of histone H3, H3R17 and H3L20 exhibit extensive interactions with this hydrophobic patch. Because H3L20 is found only in the context of the H3K27 target, the zinc-binding domain is likely to serve as a substrate determinant for KDM6A. Thus, KDM6A recognizes a relatively large portion of histone H3 with two domains, and this contributes to the highly specific activity of KDM6A toward H3K27.

Table 7 Structures containing domains of KDM6

It is noted here that possible “cross-talk” can exist between different epigenetic marks on a histone molecule. Histone H3R17 and H3R26 can be methylated by CARM1/PRMT4, and the zinc-binding domain and the JmjC domain of KDM6A interact with Arg17 and Arg26, respectively. Because KDM6A tightly holds the charged side chains of the H3R17 and H3R26 residues, methylation of Arg17 and Arg26 would decrease or block H3 peptide binding and subsequent KDM6A demethylase activity.

9.2 KDM6 in Multicomponent Complexes

Protein-protein interaction residues in KDM6 demethylases have not been identified, although the TPR repeats are suspected to be involved. Similar to the H3K9me3 epigenetic mark, H3K27me3 is tightly associated with inactive gene promoters and acts in opposition to H3K4me3. Like KDM4B, KDM6A and KDM6B are part of the MLL2 complex [364, 365] and appear to be involved in differentiation, development, and disease [338, 342, 366, 367].

10 The KDM7 Family

The KDM7 family consists of three members (Fig. 8). Each member harbors two domains in its N-terminal half: a PHD domain that binds H3K4me3, and a JmjC domain that demethylates H3K27me2/1 in KDM7A, H3K9me2/1 and H4K20me1 in KDM7B, and H3K9me1 in KDM7C [44, 368]. KDM7C activity has not been observed in vitro [369]; in vivo, however, KDM7C becomes active through a protein kinase A (PKA)-dependent histone lysine demethylase complex, PHF2–ARID5B [370, 371]. KDM7 family members have been implicated as both oncogenic and tumor suppressive. KDM7A expression was upregulated by nutrient starvation, and under those conditions its expression suppressed xenograft tumor growth by restraining angiogenesis [372].

Fig. 8 Representative crystal and domain structures of KDM7 family

KDM7B is overexpressed in prostate cancer [216, 373], breast cancer [374], laryngeal and hypopharyngeal cancer [375], non-small cell lung cancer [376], and esophageal cancer [377]. It was shown to target and promote expression of onco-miRs miR-21 [376] and miR-125b [373]. Knockdown of KDM7B in cancer cells attenuates growth [216, 373, 376, 377] as well as migration/invasion [216, 377], and induces apoptosis [216, 373, 376]. In contrast, KDM7B expression and activity are critical for the response of acute promyelocytic leukemia cells to all-trans retinoic acid treatment [378].

KDM7C expression is increased in esophageal squamous cell carcinoma and associated with poor overall survival [379]. However, most studies point to a tumor suppressor function for KDM7C in cancer. It is deleted and/or downregulated in breast cancer [380], head and neck squamous cell carcinoma [381], as well as colon and stomach cancers [382]. KDM7C was shown to associate with p53, and knockdown of KDM7C in p53 competent cells led to decreased sensitivity to genotoxic drugs, as well as reduced drug-induced expression of p21 [382]. Finally, KDM7C was shown to be necessary for treatment-induced mesenchymal to epithelial transition (MET) in breast cancer cells, which led to loss of their tumor initiating ability [383].

10.1 KDM7 Active Site

In the structure of KDM7B with histone peptide, the target H3K9me2 lies in the active site right next to the Fe2+ and the 2OG inactive analog N-oxalylglycine (NOG) [44] (Table 8). One of its terminal N-CH3 groups projects toward the aromatic ring of Tyr234, and the other methyl group points toward Asp249 and Asn333, forming two hydrogen bonds of C–H•••O type. The dimethylated terminal nitrogen atom carrying the lone pair of electrons forms a hydrogen bond with one of the oxygen atoms of NOG. The active site cannot accommodate a trimethylated lysine because the third methyl group would cause repulsive tension with NOG. Phe279 makes van der Waals contacts with Ile248 and Ile318, forming a hydrophobic core supporting the backbone of Fe2+-coordinating residues His247, Asp249, and His319. Substitution of Phe279 to serine is associated with inherited X-linked mental retardation [384,385,386,387].

Table 8 Structures containing domains of KDM7

In C. elegans KDM7A, NOG is stabilized by residues Asn421, Thr492, and Tyr505 [388]. The methylated side chain of H3K9me2 (or H3K27me2) is checked by Phe482 and Phe498 through hydrophobic interactions. One of the methyl groups of dimethylated lysine interacts with the side chains of Asp497 and Asn581 through two C–H•••O hydrogen bonds.

KDM7C appears to be an inactive demethylase. The metal binding site in KDM7C closely resembles the Fe2+ sites in other JmjC domains [369]. However, KDM7C contains a tyrosine (Tyr321) in the place of the fifth ligand, and the longer side chain of Tyr321 makes the Fe2+ move away from the corresponding binding site in KDM7B, an active demethylase. The small movement of the ferrous iron, induced by the presence of Tyr321, could position the reactive oxygen in a non-reactive mode.

10.2 KDM7 Helper Domains

KDM7A/B structures provided one of the first examples of how helper domains can either upregulate or downregulate JmjC demethylase activity through contributions associated with steric effects. The presence of H3K4me3 on the same peptide as H3K9me2 makes the doubly methylated peptide a significantly better substrate of KDM7B, resulting in a 12-fold increase in enzymatic activity as revealed by activity assays [44, 389,390,391]. By contrast, the presence of H3K4me3 diminishes the H3K9me2 demethylase activity of KDM7A with no adverse effect on its H3K27me2 activity, because the distance between the H3K4me3 and H3K27me2 marks is long enough for occupation of the PHD and JmjC domain pockets simultaneously [44, 388]. Differences in substrate specificity between the two enzymes are explained by a bent conformation of KDM7B, allowing each of its domains to engage its respective target, and an extended conformation of KDM7A, which prevents its JmjC domain from accessing H3K9me2 when its PHD domain engages H3K4me3. Thus, the structural linkage between the PHD domain binding to H3K4me3 and the placement of the catalytic JmjC domain relative to this ‘on’ H3K4me3 epigenetic mark determines which repressive marks are removed by each demethylase. In other words, the KDM7A and KDM7B JmjC domains on their own are promiscuous enzymes; it is the associated PHD domains and the linker (a determinant for the relative positioning of the two domains) that are mainly responsible for substrate specificity. It should also be noted that KDM helper domains can affect the orientation of peptide binding: a peptide in complex with a KDM6 demethylase has an opposite orientation across the JmjC domain compared to a peptide in complex with a KDM7 demethylase [44, 327, 388].

A structural study on C. elegans KDM7A suggested that the extended conformation between the PHD and Jumonji domains might enable a trans-histone peptide-binding mechanism, in which H3K4me3 associated with the PHD domain and the H3K9me2 bound to the Jumonji domain could be coming from two separate histone H3 molecules of the same nucleosome or two neighboring nucleosomes [388]. However, this trans-binding mechanism can be excluded for human KDM7A because the presence of an H3K4me3 in trans or in cis with H3K9me2 substrate peptide strongly inhibits KDM7A activity toward H3K9me2 [44]. Nevertheless, the trans-binding mechanism is an attractive model for KDM7B if the flexible loop between the PHD and JmjC enables the enzyme to adopt an extended conformation to allow binding of two peptides simultaneously. The trans-binding mechanism could explain the finding that KDM7B also functions in vivo as an H4K20me1 demethylase while its PHD domain interacts with H3K4me3/me2 in the context of nucleosome [392, 393]. However, if this were the case, an explanation would be needed as to why KDM7B is only active on monomethylated H4K20, whereas it is active on mono- and dimethylated H3K9 and H3K27. One possibility is that only H4K20me1 co-exists with H3K4me3/me2 in vivo.

10.3 KDM7 in Multicomponent Complexes

The C-terminal halves show little homology among the family members and do not contain any known domains. Nonetheless, C-terminal parts of members are essential for their gene regulatory functions. For example, it was found that KDM7B binds to RNA polymerase I/II, KMT2, HCF1, E2F1, ZNF711 and RAR, under the control of the C-terminal portion of KDM7B [389, 390, 392, 394]. In addition, KDM7C is associated with p53 through its C-terminal region [382]. It appears probable that the variability of the C-terminal halves of KDM7 members provides functional diversity by choosing different histone demethylase partners for transcription.

11 Molecular Basis of KDM Inhibition and Development of Inhibitors into Drugs

Current epigenetic therapies primarily involve inhibitors of DNA methylation and histone deacetylation [400]. Considering the significant implication of KDMs in the development of various diseases, a thorough understanding of their molecular mechanism and effective therapeutic inhibition is of considerable interest, but is still in its infancy. Further characterization of many demethylases is proceeding through both functional studies and the development of small molecule inhibitors targeted against them. These studies will be invaluable for our understanding and treatment of cancer. There are two possible ways by which a demethylase-inhibiting drug may be able to halt or even prevent cancers. It can repress oncogenes and/or activate tumor-suppressor genes that are deregulated by methylation processes [401, 402] or overcome resistance to chemotherapy [283, 403, 404]. Transient and reversible drug resistance develops in certain cancer cell populations during treatment with cancer drugs. KDM5A is at least one chromatin-modifying enzyme required for establishing a drug-tolerant subpopulation [297]. Reduced methylation of H3K4 has also been linked to poor prognosis in cancer patients [405].

The many crystal structures of demethylases have revealed substantially conserved Fe2+ and 2OG binding sites; yet, differences in Fe2+ and 2OG binding sites are idiosyncratic to each KDM family and may be exploitable for the development of selective inhibitors. For instance, N198 in KDM4A and N1156 in KDM6A establish a hydrogen bond at the back of the pocket with the carboxylic moiety of 2OG, while in KDM2A and KDM7 family members, the asparagine is replaced by a tyrosine that causes a different Fe2+ coordination of the carboxylic moiety of the cofactor and a loss of the hydrogen bond. In a KDM4A-inhibitor structure [263], a π-π stacking interaction with F185 at the front part of the pocket can be observed; this phenylalanine is only conserved within the KDM4 and KDM5 families [406]. KDM2A/6A/7 show a threonine at this location that would prohibit a π-ring system at this position. Another example is the invariant cysteine in the active site of the KDM5 family (Cys-481 in KDM5A), which spatially replaces residues in other KDMs (e.g., Pro-1388 of KDM6B). Exploring the interaction between this noncatalytic cysteine and studied inhibitors could provide an avenue for improved potency, selectivity, and prolonged on-target residence times of inhibitors specific for the KDM5 family. For example, an approach of using reversible covalent inhibitors that target noncatalytic cysteine residues to achieve prolonged and tunable residence time has recently been demonstrated with protein kinases [407]. Both reversible [408] and irreversible inhibitors [122, 123, 409, 410] have been made against KDM1A, and some of these inhibitors have entered clinical trials as drugs for cancers such as acute myeloid leukemia [411] and small cell lung carcinoma [412, 413].

The pace of inquiry in the KDM inhibitor field is accelerating: the number of papers published and patent applications filed in the last several years attests to the expectation that work in this area will yield KDM chemical probes and drug candidates [122, 273, 277, 411, 414,415,416]. A comprehensive review is beyond the scope of this chapter; a few highlights in this area are discussed below.

11.1 Inhibitors of KDM1 Demethylases

There are many compounds that are KDM1 inhibitors (Fig. 9a). The AOD catalytic domain of KDM1 is homologous to those of the monoamine oxidases (MAOs) A and B and this has facilitated studies for this KDM family. Consequently, several well-studied MAO inhibitors, including phenelzine and tranylcypromine (TCP), an FDA-approved treatment for psychological disorders [417], have been demonstrated to also inhibit KDM1A [418, 419]. A mechanism-based irreversible inhibitor, TCP forms a covalent adduct with the FAD cofactor within the active site of the enzyme [94, 126, 419]. The application of TCP as an inhibitor of KDM1A has provided promising proof-of-principle data in mouse models and leukemia cell lines [82, 83]. However, such non-selective amine oxidase inhibitors could obviously have adverse effects and are not ideal solutions for KDM1 inhibitors. Therefore, derivatives of tranylcypromine have been made, and the first structures with enhanced potency and target selectivity for KDM1A were obtained through modification of the phenyl group of TCP using crystal structures of KDM1A with TCP or a KDM1-selective peptide-based inhibitor [127, 420].

Fig. 9 Representative inhibitors of (a) KDM1 family and (b) JmjC demethylase families

Since KDM1 can specifically recognize the twenty-one amino acids from the N-terminal tail of histone H3, inhibitors containing these twenty-one amino acid long peptides from the histone H3 N-terminal tail with modifications on target lysine have been synthesized; a propargyl-Lys-derivatized peptide functions as a potent and selective time-dependent inactivator of KDM1A [421]. However, even a H3 peptide with methionine replacement for the target lysine appears to be a good inhibitor (Ki = 40 nM) and a structure was determined [95]. Peptides derived from SNAIL1 and INSM1 sequences could also act as KDM1A inhibitors [130]. SNAIL1 is a transcription factor that binds to the KDM1A active site through its SNAG (Snail/GFI) domain with the N-terminal 21 residues adopting a similar conformation to the H3 substrate and acts as a competitive inhibitor. INSM1 (insulinoma-associated protein) is another member of the same family of transcription factors as SNAIL1 and binds to KDM1A with similar affinity. However, crystal structures showed that only the first nine and eight residues of the two transcription factor peptides, respectively, bind in an ordered conformation. Several such small peptides exhibited competitive inhibition and crystal structures of both of these with KDM1A were determined (see Table 2). In addition, novel and potent cyclic peptide inhibitors of KDM1A have been developed [422]; an advantage of cyclic peptides is their significant stability to hydrolysis in plasma.

11.2 Inhibitors of JmjC Demethylases

Many different types of compounds inhibit the JmjC demethylases (Fig. 9b). The majority of JmjC KDM inhibitors identified to date incorporate carboxylic acids/carboxylic acid analogs, leading to use of pro-drug ester forms for sufficient cellular activity. The inhibitors occupy the 2OG binding site and may contain moieties that occupy other potential binding sites such as the region where the methyl-lysine binds. There has been a rapid increase in reports of JmjC KDM inhibitors both in the scientific literature and in patents in the last few years; several excellent reviews and reports of new molecular inhibitor scaffolds have appeared recently [271, 277, 414, 423,424,425,426]. However, many of these KDM inhibitors lack the desired selectivity, potency and pharmacokinetic properties (particularly cell permeability) necessary to be considered as probe molecules for the investigation of individual KDM function in cancer or for development as cancer drugs. There are basically three types of JmjC demethylase inhibitors: 2OG mimetics, compounds that target peptide binding sites, and compounds that interfere with the action of helper domains. We discuss select compounds from each category below, as well as some of the challenges associated with their use.

11.3 TCA Cycle Intermediates and 2OG Mimetics

The development of JmjC demethylase inhibitors is likely to pose challenges with respect to reaching sufficient potency, given the intracellular competition by excess cofactor and cofactor-like compounds. Cancer-associated mutations in tricarboxylic acid (TCA) cycle enzymes lead to abnormal accumulation of TCA cycle metabolites that have been linked to oncogenic transformation. These metabolites are themselves inhibitors of KDMs when they exist in a cell at high concentrations. Mutations to these enzymes are common in tumors and can result in very substantial increases in the concentrations of succinate, fumarate, or 2-hydroxyglutarate (2HG) [427,428,429,430]. 2HG is a five-carbon dicarboxylic acid with a chiral center at the second carbon atom; therefore, there are two possible enantiomers, (S)-2HG and (R)-2HG. Mutations cause isocitrate dehydrogenase (IDH) 1 and 2 to convert 2OG into 2HG, in addition to producing 2OG from isocitrate [431]. IDH mutants exclusively produce the (R) enantiomer of 2HG, and the levels of (R)-2HG in IDH mutant tumors can be extremely elevated, ranging from 1 mM to as high as 35 mM [432,433,434]. Succinate (a co-product of the JmjC demethylase reaction), fumarate, and 2HG all inhibit JmjC demethylases, though rather weakly (in the μM to mM range for KDM2A, KDM4A, KDM4C and KDM5B, as shown with isolated proteins and in cells [266, 395, 435]), via competition with 2OG [254, 436]. A number of KDM4A and KDM6C crystal structures with these compounds have been solved (see Tables 5 and 7).
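The consequence of cofactor competition for apparent potency can be illustrated with the standard Cheng-Prusoff relation for a purely competitive inhibitor, IC50 = Ki (1 + [S]/Km), where S is here taken to be 2OG. The short Python sketch below is illustrative only: the function name and the Ki, Km, and 2OG concentrations are hypothetical placeholders chosen for round numbers, not values taken from the studies cited above, and the relation assumes simple competitive inhibition.

```python
def apparent_ic50(ki_um: float, substrate_um: float, km_um: float) -> float:
    """Cheng-Prusoff estimate of the IC50 (in uM) of a purely competitive inhibitor.

    ki_um        -- inhibition constant of the inhibitor (uM)
    substrate_um -- concentration of the competing cofactor, e.g. 2OG (uM)
    km_um        -- Michaelis constant of the enzyme for that cofactor (uM)
    """
    if km_um <= 0:
        raise ValueError("Km must be positive")
    if ki_um < 0 or substrate_um < 0:
        raise ValueError("Ki and substrate concentration must be non-negative")
    return ki_um * (1.0 + substrate_um / km_um)


# Hypothetical values, for illustration only (not measured data).
KI_UM = 1.0       # assumed Ki of a 2OG-competitive inhibitor
KM_2OG_UM = 10.0  # assumed Km of the demethylase for 2OG

for two_og_um in (10.0, 100.0, 1000.0):
    ic50 = apparent_ic50(KI_UM, two_og_um, KM_2OG_UM)
    print(f"[2OG] = {two_og_um:7.1f} uM -> apparent IC50 = {ic50:6.1f} uM")
```

Under these assumed numbers, raising the competing cofactor concentration 100-fold raises the apparent IC50 roughly 50-fold, which is why a compound that looks potent against an isolated enzyme may appear much weaker in cells.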

The 2OG analog NOG has generally been used as an inhibitor for in vitro studies [17]. In NOG, the C-3 methylene group of 2OG is replaced with an NH group to give an N-oxalyl amide derivative that likely stalls the catalytic reaction by hindering oxygen binding to the active site iron. NOG has been utilized in many crystallizations of JmjC demethylases, especially those in which peptide is present. Often structures also include a non-catalytic metal ion such as Ni2+, Co2+, or Mg2+ as a substitute for Fe2+. Metal chelating compounds such as diols can also inhibit JmjC KDMs at high concentration. For example, the common buffer TRIS inhibits KDM4C with a Ki = 11 mM and a crystal structure with TRIS has been solved with the compound clearly in the active site when the crystal was grown in the absence of 2OG [275].

Analysis of the X-ray crystal structure of KDM4A in complex with NOG and a trimethylated peptide [253] led to the design and synthesis of NOG derivatives substituted with an alkyl-linked dimethylaniline group in order to mimic the interactions of the trimethylated peptide with the protein [437]. These derivatives maintained the inhibitory action of NOG against KDM4A, and illustrate a strategy of linking the 2OG and peptide substrate binding sites to further increase JmjC KDM inhibition.

11.4 Daminozide and Hydroxamic Acid-Based JmjC KDM Inhibitors

Daminozide is selective for the KDM2/7 families over other members of the human JmjC KDMs [KDM2A (IC50 = 1.5 μM) and KDM7B (IC50 = 0.55 μM)] [268]. Crystallographic studies revealed that daminozide chelates the active site iron via its dimethylamino nitrogen lone pair and C-4 carbonyl group, with its C-1 carbonyl occupying the same site as the 2OG C-5 carboxylate. This selectivity may be engendered by the more hydrophobic region created by the Tyr257, Val255 and Ile191 residues adjacent to the iron ion in the KDM2/7 families compared to the more hydrophilic residues in the corresponding regions in KDM4A and other JmjC demethylases.
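Relative potency between two enzymes is commonly summarized as a ratio of IC50 values. As a minimal sketch, the Python snippet below applies this to the two daminozide IC50 values quoted above; the function and variable names are hypothetical, and a ratio of IC50 values is only a rough guide because IC50 depends on assay conditions.

```python
def fold_selectivity(ic50_reference_um: float, ic50_target_um: float) -> float:
    """Return how many fold more potent a compound is against the target
    enzyme than against the reference enzyme (ratio of IC50 values)."""
    if ic50_reference_um <= 0 or ic50_target_um <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_reference_um / ic50_target_um


# Daminozide IC50 values (uM) as quoted in the text.
DAMINOZIDE_IC50_UM = {"KDM2A": 1.5, "KDM7B": 0.55}

ratio = fold_selectivity(DAMINOZIDE_IC50_UM["KDM2A"], DAMINOZIDE_IC50_UM["KDM7B"])
print(f"Daminozide is about {ratio:.1f}-fold more potent against KDM7B than KDM2A")
```

The same helper could be applied to IC50 values for enzymes outside the KDM2/7 families, provided they were measured under comparable assay conditions.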

11.5 Pyridine Derivatives

A screen using known inhibitors of other 2OG dependent oxygenases identified 2,4-pyridinedicarboxylic acid (2,4-PDCA), which showed potent inhibitory activity on KDM4E (IC50 = 1.4 μM) [262]. The structure of KDM4A (and later other KDMs; see Tables 5 and 6) with bound 2,4-PDCA showed that 2,4-PDCA functions in a 2OG competitive manner. 2,4-PDCA binds the Ni2+ cation in a bidentate manner via its N-atom and 2-carboxylate, whereas the 4-carboxylate mimics 2OG binding by forming two hydrogen bonds with a Lys and a Tyr in the active site. Many compounds have a similar binding mechanism; however, the minimal binding requirements in the 2OG site appear to be one atom binding to the metal and one binding to the Lys residue in JmjC demethylase binding sites [282].

There are two other inhibitors that are pyridine derivatives and have been studied in greater detail, both biochemically and structurally, than other inhibitors amongst several KDM families: GSK-J1/J4 (4) and KDM5-C49/C70. GSK-J4 and KDM5-C70 are cell permeable prodrug ethyl esters that are hydrolyzed by an esterase within the cell to generate GSK-J1 and KDM5-C49, respectively. GSK-J1 is a potent inhibitor of the H3K27 histone demethylases KDM6A and KDM6B with in vitro IC50 values of 56 and 18 nM, respectively [358]. However, GSK-J1 is a good inhibitor for other KDM families as well, particularly KDM5 [438, 439]. GSK-J1 contains a propanoic acid moiety that mimics 2OG binding, and a pyridyl-pyrimidine biaryl that chelates the active site Fe2+, inducing a shift in its position. GSK-J4 is still one of the few inhibitors shown to have cell activity. GSK-J4 has anticancer effects against acute lymphoblastic leukemia and pediatric brainstem glioma [343, 359], as well as the ability to target ovarian cancer stem cells [360]. GSK-J1 has been crystallized with members of the KDM5 and KDM6 families (Tables 6 and 7).

KDM5-C49/C70 is reported to be a potent and selective inhibitor for the KDM5 family, but is a good inhibitor for the KDM4 and KDM6 families as well [274, 282]. KDM5-C49 is a 2,4-PDCA analog and shows nanomolar inhibitory potencies in enzymatic assays across several JmjC families. KDM5-C70 also led to cell cycle arrest in multiple myeloma cells and breast cancer cell lines, with an observed increase in global H3K4me3 levels [274, 282].

A pan-inhibitor, JIB-04, was identified in an unbiased cellular screen, and shown to effectively and specifically inhibit several KDM families’ activity in vivo as well as in vitro [440]. Furthermore, JIB-04 could specifically inhibit KDM function in cancer cells, as well as in tumors in vivo. JIB-04 is not a competitive inhibitor of 2OG, and the exact molecular mechanism is unclear. There is relative selectivity of JIB-04 toward KDM5B versus KDM5C in vitro, which correlated with an increased cellular potency overall in vivo and a propensity for cell type specificity not observed with GSK-J4 in one study [54]. High-throughput screening also identified 2,4-PDCA-related 8-hydroxyquinoline compounds as inhibitors of KDM demethylases that were further developed [264, 269, 414]. Natural products such as flavonoids and catechols have been demonstrated to inhibit a number of 2OG oxygenases, including the JmjC KDMs [416, 441].

11.6 A Compound Showing Some Selectivity

Two similar compounds have been crystallized with variant truncated constructs of KDM5A [282, 283]. One compound, CPI-455, is cell permeable while the other, N8, is not. However, both are amongst the most effective KDM5 inhibitors in vitro and are more selective for the KDM5 family than other KDM families. The only difference between the two compounds is a substitution of a methyl group in N8 with a phenyl group in CPI-455 (Fig. 9b). Interestingly, the addition of a methyl group to the phenyl group of CPI-455 to produce CPI-4203 makes this inhibitor less cell permeable and an inactive or very weak control for cell assays [283]. That such small changes make compounds more or less cell permeable in ways we cannot yet explain reflects our present lack of knowledge of the characteristics required for permeability.

The position occupied by these inhibitors (Fig. 10c) completely overlaps the binding site of 2OG, demonstrating a competitive mode of action, as suggested by biochemical assays [282, 283]. The nitrile group of these KDM5 inhibitors makes a single interaction with the active-site metal ion, while a ring nitrogen atom forms a hydrogen bond with the side chain of Lys501. The carbonyl oxygen off the ring is within hydrogen bonding distance to the side chains of Asn575, as well as Lys501. In the KDM5 structure with 2OG, the side chain Asn575 bridges between Lys501 and Tyr409, which form hydrogen bonds with the carboxylic group of 2OG. In contrast to the structure with 2OG, the side chain of Tyr409 is pushed away by the bulky pyrazolopyrimidine ring and is rotated nearly 90° from that of the 2OG-bound form, resulting in a van der Waals contact with the isopropyl substituent on these compounds [54]. In the CPI-455 structure, Tyr409 as well as Arg73 is pushed even further away from the active site because of the phenyl ring substitution of the methyl group in N8. In both structures, the central pyrimidine ring sandwiches between the aromatic rings of Tyr472 and Phe480. The phenyl group in CPI-455 forms an edge-to-face aromatic contact with Tyr409 and points toward solvent.

Fig. 10 Crystal structures containing (a) GSK-J1, (b) KDM5-C49 and (c) compounds N8 and CPI-455

All amino acids within 4 Å of the inhibitor are conserved in the KDM5 family; hence, N8 and CPI-455 inhibit all KDM5 family members. The selectivity of these compounds for KDM5 versus KDM2, KDM3 and KDM6 proteins derives from conformational and sequence differences within their active sites. For instance, KDM6B is more constricted in the region flanking the phenyl and isopropyl substituents off the pyrazolopyrimidine ring of these compounds. The scaffold of CPI-455 is being further developed by improving the interactions with the Tyr409 side chain with modifications of both the isopropyl and phenyl groups to improve inhibition of KDM5 demethylases and the cell potency of the compound [442].

11.7 Inhibitors to Substrate Binding Regions

An inhibitor of G9a methyltransferase (BIX-01294) and its analog E67 also inhibited the human H3K9me2 demethylase KDM7A [396]. These compounds act as H3 substrate analogues, which is consistent with both enzymes recognizing methyl-lysine residues, either as product or as substrate. Compound E67 was shown to inhibit KDM7A and KDM7B with IC50 values in the low-micromolar range in an in vitro mass spectrometric demethylation assay, but was inactive against KDM5C. E67 exhibited cytotoxicity at concentrations around 50 μM against mouse and human primary fibroblasts. A crystal structure confirmed binding of this compound to the active site of the enzyme [396]. A compound that mimics both Lys and 2OG was synthesized and appeared to selectively inhibit KDMs [443]. In addition, its prodrug methylstat selectively inhibited JmjC demethylases in cells and could inhibit cell growth of an esophageal carcinoma cell line.

11.8 Inhibitors to PHD Binding Helper Domains

Helper domains of JmjC KDMs can be tractable targets and provide promising leads for development of inhibitors targeting noncatalytic domains of JmjC KDMs. For instance, small molecule inhibitors targeting the PHD3 domain of KDM5A were identified with an assay that uses 96-well polystyrene plates activated with a synthetic ligand for covalent and oriented capture of a protein fusion to KDM5A PHD3; this allowed screening for molecules that displaced histone H3K4me3 binding to PHD3 [444]. Screening of the NIH Clinical Collection 1 library identified compounds such as disulfiram, phenothiazine, amiodarone, and tegaserod maleate as inhibitors (Fig. 10a). Disulfiram inhibits KDM5A PHD3 and other PHD fingers not by acting as a ligand, but through ejection of structural zinc, thus revealing a general susceptibility specific to PHD fingers as a histone reader domain. The compounds were further tested through affinity pull-downs, fluorescence polarization, and histone reader specificity studies. Inhibitors based on amiodarone derivatives were identified to be potent against KDM5A-PHD3, with IC50 values in the 25–40 μM range [444].

12 Conclusions

Crystal structures of catalytic domains exist for every human KDM family. Additionally, there is a quickly growing number of structures of these domains with inhibitors containing different chemical moieties in the active site. However, there is a substantial need for developing new types of inhibitors, likely aided by improving our understanding of all of these structures. Because we know that some catalytic domains are inactive when expressed alone, these structures need to be further supplemented by solution studies and greater biochemical analysis of KDM selectivity.

There is still much to learn about these demethylases. For instance, very little is known about large parts of some KDM demethylases, such as the second half of the KDM5 and KDM7 families. Discovery is just beginning on how KDM domains interact with each other and how these domains interact with other proteins in multicomponent complexes. Recent advances in single-particle cryo-electron microscopy (cryo-EM) may aid in this regard [445]. These advances are enabling generation of numerous near-atomic resolution structures for well-ordered protein complexes with sizes ≥200 kDa. Cryo-EM should allow structure determination of the large KDMs with all their domains and of their complexes with interacting proteins and nucleosomes, as well as provide information about the dynamic conformational states of these domains and complexes.

Future detailed structural information from both X-ray crystallography and cryo-EM will offer further understanding about the molecular basis of histone demethylation, i.e. how demethylases exert their substrate specificities and function in histone regulation. In turn, this will allow better development of inhibitors, which may potentially be utilized as drugs in mankind’s battle against various cancers where demethylases play a substantial role.