Abraham Wald’s “missing bullet hole” genius applies to the CMG helicase

While working for a classified program called the Statistical Research Group in Manhattan during the Second World War, the mathematician Abraham Wald was tasked with determining the best locations on a warplane to increase armor to prevent combat loss from dogfights or ground fire [1]. Placing additional armor on a plane rendered it heavier and less agile, creating a liability in need of careful consideration. Military officers reasoned that the preponderance of bullet holes on the fuselage and wings of aircraft returning to base suggested those were the most vulnerable areas in need of such armor. But Wald’s mathematical logic presented the military with an unorthodox answer: the planes should have their armor increased where the bullet holes were not observed, namely around the engine [1]. He reasoned that because planes with damage to their fuselages and wings returned to base, those sites of damage, while debilitating, were survivable. However, the lack of planes returning with significant bullet holes in their engine compartments indicated that the engine was a major point of vulnerability that reduced survivability of the planes and flight crews when targeted with gunfire. It was the statistical absence of bullet holes, the “missing bullet holes,” that defined the best location to bring down an aircraft [1].
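
Wald's reasoning can be illustrated with a small simulation. The sketch below is not a reconstruction of Wald's actual analysis; it assumes each plane takes a single hit in a randomly chosen section, and the loss probabilities per section are hypothetical values chosen only to show how survivorship hides the most lethal hits.

```python
import random

def simulate_returning_hits(n_planes, p_loss_by_section, seed=0):
    """Give each plane one hit in a random section and tally the hits
    (a) on all planes and (b) only on planes that return to base."""
    rng = random.Random(seed)
    sections = list(p_loss_by_section)
    all_hits = {s: 0 for s in sections}
    returned_hits = {s: 0 for s in sections}
    for _ in range(n_planes):
        section = rng.choice(sections)          # hits are spread evenly
        all_hits[section] += 1
        # The plane survives unless the hit proves fatal for that section.
        if rng.random() >= p_loss_by_section[section]:
            returned_hits[section] += 1
    return all_hits, returned_hits

# Hypothetical loss probabilities (illustrative assumptions, not data).
p_loss = {"fuselage": 0.1, "wings": 0.1, "engine": 0.8}
all_hits, returned_hits = simulate_returning_hits(30_000, p_loss)

for section in p_loss:
    print(f"{section}: {all_hits[section]} hits overall, "
          f"{returned_hits[section]} seen on returning planes")
```

Although every section is hit about equally often, the returning planes show few engine hits, which is the "missing bullet hole" pattern that Wald interpreted as a sign of vulnerability rather than safety.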

A type of analogy to Wald’s missing bullet holes and engine vulnerability exists when assessing the roles of the replicative Cdc45-MCM-GINS (CMG) helicase as a driving mechanism underlying cancer development, while also defending the worthiness of the CMG helicase as a druggable target for anti-cancer intervention. Defining the cell cycle and DNA replication as a metaphorical “engine” in propelling a cell forward during proliferation, the presence of mutated or overexpressed oncogenes that drive the cell cycle engine during tumorigenesis is often used to identify potential tumor-specific dependencies. The mutations in such altered genes are analogous to the bullet holes in aircraft assessed by Wald, and are often used to define where one should focus anti-cancer drug development efforts. As will be described in more depth below, CMG helicase activity is central to tumorigenesis and the cell cycle engine, yet human malignancies rarely, if ever, contain mutated CMG components that represent cancer-driving situations [2, 3]. As with Wald’s engine situation, where damage to the engine is not compatible with the survival of aircraft [1], it appears likely that despite elevated mutation rates within tumors, tumor cells cannot survive if they mutate CMG components in ways that reduce or eliminate function. It is this “missing mutation” status of the CMG helicase as a needed survivability factor and lack of cancer-driving CMG mutations, together with mismanagement and alteration of CMG assembly and activation by oncogenic drivers [3,4,5,6,7], that renders the CMG a “never-mutated” tumor-specific vulnerability and justifiable target for anti-cancer drug development.

The CMG helicase: assembly and activation

The replicative helicase is a large enzyme composed of 11 primary subunits, including Cdc45, the tetrameric GINS complex named for its subunits (Go-Ichi-Ni-San, for 5-1-2-3 in Japanese; subunits Sld5, Psf1, Psf2, Psf3), and the MCM2-7 hexameric ATPase core of Mini-Chromosome Maintenance proteins (Fig. 1). Using these subunits as the naming basis (Cdc45-MCM-GINS), the replicative helicase is referred to as the CMG helicase, or simply the CMG [8,9,10,11]. This review will focus on the relevance of the replicative CMG helicase to human cancer by discussing important aspects of CMG assembly and regulation by signaling pathways and how problems with CMG management are involved in cancer development. For an in-depth understanding of the molecular and biochemical details of the initiation or elongation phases of DNA replication, the reader is referred to some elegant research reports and reviews on these topics [9, 10, 12,13,14,15,16,17,18,19,20,21,22,23,24]. In addition, to complement the discussion here, the reader may wish to examine several other reports on the roles of CMG components in cancer for additional insight [25,26,27,28].

Fig. 1: Structure and assembly of the replicative human CMG helicase.

A, B Space-filling and ribbon structures of the human CMG helicase obtained from cryo-electron microscopy (cryo-EM) (PDB accession code 6XTX). C Diagrams showing CMG helicase assembly steps. Double MCM hexamers are loaded first onto DNA, and near the G1/S transition the Cdc45 and GINS subunits are recruited to form double CMGs. Initiation at G1/S leads to establishment of dual CMG helicases on ssDNA, moving in opposite directions within replisome complexes. General locations of DNA polymerases, topoisomerase II, Mcm10, and Ctf4 are shown. Note that during S-phase late-firing DNA replication origins proceed through similar CMG establishment steps. See text for details and references.

The CMG helicase functions during DNA replication to unwind the DNA at replication forks ahead of the DNA polymerases that catalyze the generation of new daughter DNA strands [14,15,16, 18, 19, 29,30,31,32]. The CMG also mediates the initial melting of double-stranded DNA (dsDNA) at replication origins where DNA replication begins in a bidirectional manner [16, 18, 23, 33]. The CMG is the only replicative helicase enzyme that catalyzes these particular melting and unwinding steps during DNA replication, as there are no known enzymes in cells that can replace its function in these processes. The larger multi-protein complex tasked with duplicating the DNA at replication forks during the synthetic S-phase is called a replisome, composed of the CMG helicase, DNA polymerases and primases, topoisomerases, and other associated factors [11, 22, 34,35,36,37]. The CMG helicase is also required for cells to recover DNA replication after encountering fork-stalling stress, or replicative stress, during S-phase [3, 38,39,40,41]. This role of the CMG helicase in recovering from replicative stress is highly relevant to how CMG problems contribute to cancer and requires an understanding of how the CMG helicase is assembled and regulated.

CMG assembly occurs in G1 phase and begins with the recruitment of MCM hexamers to DNA (within chromatin) by the concerted actions of the Origin Recognition Complex (ORC), Cdc6, and Cdt1 (Cdc10-dependent transcript-1) [12, 13, 42,43,44,45,46,47,48,49,50,51,52,53,54]. ORC is analogous in function to Initiator proteins in prokaryotic and viral organisms that bind replicator sequences at DNA replication origins [called ori’s, or autonomous replicating sequences (ARS) in yeast] to facilitate loading of replisome proteins, including the helicase [47, 53, 55,56,57]. However, in higher eukaryotic cells ORC does not have a DNA site-specific binding requirement, instead being influenced in its DNA interactions by general DNA sequence composition (prefers AT-rich) or limited by torsional DNA stress [58,59,60]. Consistent with this, and as discussed more below, mammalian cells do not possess specific DNA replicator sequences at origins [58, 59, 61, 62]. ORC and Cdc6 contain ATPase domains that are involved in the coordination of MCM loading, and MCM loading onto DNA (within chromatin) requires ATP binding and hydrolysis by the MCM subunits [13, 20, 42, 53, 63,64,65,66]. Two MCM2-7 hexamers are loaded by ORC onto DNA in a head-to-head manner encircling the DNA (Fig. 1), with their amino-terminal ends facing each other and the carboxy-terminal ends containing ATPase domains facing outward [12, 13, 20, 23, 24, 42]. The loaded MCM hexamers are referred to as a pre-Replication Complex, or pre-RC [44, 45], and MCM loading is also referred to as licensing DNA for one round of DNA replication [42, 67,68,69,70,71,72].

MCM loading also requires functions of Cdc6 and Cyclin E, and for the latter, there are kinase-dependent (with Cdk2) and kinase-independent roles [45, 48, 66, 73,74,75,76,77,78]. Cyclin E-Cdk2 is recruited to MCM loading sites through interactions with Cdc6 [76, 77, 79], which is also a substrate of Cyclin E-Cdk2 kinase activity [76, 77, 80]. Phosphorylation of Cdc6 prevents its degradation by APC/C-dependent proteolysis [81,82,83]. In the absence of its associated kinase, Cyclin E also interacts with MCM subunits such as Mcm7, and with Cdt1, both events being required for MCM hexamer loading [76,77,78]. These necessary interactions of Cyclin E and its associated kinase during MCM recruitment function upstream of, or coincident with, the functions of Cdt1 in MCM loading [76, 77]. Cyclin E-Cdk2 kinase can phosphorylate Mcm3 and Mcm7 in vitro [84, 85], and phospho-blocking mutation of one targeted site in Mcm3 (T722A) reduces Mcm3 chromatin binding [84], suggesting that phosphorylation of Mcm3 by Cyclin E-Cdk2 is important in MCM assembly. Similarly, a mutation in Mcm7 that blocks phosphorylation of a putative Cyclin E-Cdk2 site (Mcm7-S121A) results in reduced interactions between Mcm7 and Mcm3-Mcm5-Cdc45 complexes [85], suggesting a need for Cyclin E-Cdk2 phosphorylation of this site in pre-RC assembly. However, the Mcm7(S121A) mutant protein appears to be more capable of binding chromatin compared to wildtype Mcm7 [85], suggesting that phosphorylation plays a more complex role in MCM regulation. Indeed, as discussed below, overexpression of Cyclin E causes a genome-wide destabilization of MCM hexamers on chromatin [7]. Xenopus and human Mcm4 proteins have been shown to be substrates of Cdc2 (Cdk1) and Cyclin A-Cdk2 kinases, respectively [86, 87], and phosphorylation by these kinases reduces chromatin association of MCM complexes, part of a process to prevent re-licensing of DNA in late S-phase and G2/M phases [86,87,88].
Although not yet determined, it is possible that deregulation of Cyclin E-Cdk2 (due to Cyclin E overexpression) in mammalian cells might target some of these phosphorylation sites in Mcm4 or other MCM subunits, causing the reduced MCM hexamer chromatin affinity that is observed [7]. Interestingly, for both Mcm3 and Mcm7, overexpression of wildtype proteins causes a block to S-phase entry and checkpoint activation [84, 85], indicating that overexpression of single MCM subunits is not tolerated by mammalian cells. Cyclin E-Cdk2 also phosphorylates Treslin, the homolog of the yeast Sld3 protein, which facilitates interaction with TopBP1 (DNA topoisomerase II binding protein 1; homolog of yeast Dpb11) and promotes recruitment of Cdc45, GINS, and DNA polymerases to chromatin [89,90,91]. Although some of these Cyclin E-Cdk2-mediated events in MCM and CMG assembly are known, a complete mechanistic picture of the contribution of Cyclin E- and Cyclin A-associated kinases in regulating MCM/CMG function awaits further investigation.

During the cell cycle, Cyclin E-Cdk2 becomes active in middle-late-G1 phase, which fits nicely with the middle-late-G1 timing of when MCMs are loaded onto chromatin in mammalian cells released from quiescence [92,93,94,95,96,97,98]. Another target of Cyclin E-Cdk2, the Rb tumor suppressor protein, also becomes increasingly phosphorylated coincident with MCM loading, indicating that Cyclin E-Cdk2 (and Cyclin E) dependent MCM loading occurs close to, or at, the Restriction Point of the cell cycle [92, 98, 99]. As will be described below, Rb (and hyperphosphorylated Rb) also plays a role in regulating MCM and CMG function in late-G1 phase. Collectively, these results suggest that, in cells released from quiescence, passage through the Restriction Point (R-Point) parallels MCM loading and licensing of DNA for one round of DNA replication [92]. Interestingly, in cycling mammalian cells (without an intervening quiescent period), MCMs load onto chromatin even earlier, during late mitosis after chromosome separation [100]. The latter indicates that under continuous cycling conditions, the many factors required for MCM loading are present and active prior to or after daughter cells are created.

Cdt1 has no enzymatic domains but plays pivotal roles in the MCM assembly process. Cdt1 contains a carboxy-terminal Mcm6 interacting domain that is required for making contacts with the MCM complex during loading [101,102,103,104]. A small protein called Geminin inhibits Cdt1 function to block MCM assembly, and Geminin achieves this in part by inhibiting the Cdt1-Mcm6 interaction and Cdt1 DNA binding [105,106,107,108,109]. Cdt1 also regulates MCM loading through interactions with chromatin-modifying enzymes, including a histone acetyltransferase (HAT) and histone deacetylase (HDAC) [110, 111]. In G1 phase Cdt1 binds to HBO1 (HAT binding ORC1) to facilitate localized chromatin decondensation and MCM loading onto accessible DNA, which is suppressed by Geminin [110,111,112,113,114]. Once cells enter S-phase and DNA replication begins, Cdt1 interacts with HDAC11 to promote chromatin closure and block further MCM loading to prevent another round of licensing [110]. The interaction between Cdt1 and HDAC11 is enhanced by Geminin binding [110]. Demonstrating its pivotal role in MCM loading and licensing using these mechanisms, overexpression of Cdt1 (during S-phase) can cause cells to re-license their DNA for another round of DNA replication, causing genomic (chromosomal) re-replication within a single cell cycle [106, 110, 115]. In addition, as described more below, Cdt1 overexpression can abrogate growth-inhibitory signals of TGFß1 under certain conditions and force MCM loading when it would normally be blocked [77]. Cdt1 thus appears epistatic to most of the events regulating MCM loading, being able to achieve MCM re-loading as a single deregulated factor. Accordingly, Cdt1 is oncogenic [116, 117], and its role in promoting excessive MCM loading likely contributes to tumorigenesis by creating genomic destabilization [106].

The conversion of a pair of MCM hexamers to a pair of CMG helicases (double CMGs; dCMGs) requires recruitment of Cdc45 and GINS, which occurs near G1/S and at future origins that fire later in S-phase [11, 18, 23, 24, 31, 36] (Fig. 1). Metazoan Cdc45 recruitment requires combined Cdk2 and Cdc7-Dbf4 (DDK) kinase activities, and PP2A phosphatase activity [24, 118,119,120,121,122,123]. GINS recruitment requires Cdc7-Dbf4 and Cdk2 [24, 123]. In yeast, although Cdc45 recruitment to MCMs does not require Cdk2, stable interaction of Cdc45 within CMGs requires GINS and Cdk2 activity [123]. Interestingly, in quiescence-release mammalian cell models CMGs can begin assembly in late-G1 several hours prior to G1/S based on the chromatin association of MCMs and Cdc45 (GINS loading kinetics was not assessed), yet the CMG remains enzymatically inactive until G1/S in terms of processive unwinding of DNA [77, 92, 124]. Yeast cryo-electron microscopic studies have demonstrated that when MCM pairs are converted to two opposing dCMGs prior to, or coincident with, the G1-S transition, changes to CMG-CMG and CMG-DNA interactions produce an ATP-dependent localized destabilization of the DNA within MCM cores of the dCMGs such that a few base pairs are melted and stabilized by each Mcm2 protein [23, 24]. This step likely represents one of the earliest events in the initiation of DNA replication.

Once triggered to (further) melt and unwind DNA at G1-S by mechanisms that remain unclear in mammalian cells, CMG pairs pass each other (Fig. 1) and CMG movement within replisomes at replication forks relies on an orchestrated set of ATP hydrolysis steps within the MCM hexameric “core” of the CMG [9, 10, 14, 15, 19]. Neither Cdc45 nor GINS possess enzymatic domains, but their association with the MCM hexamers increases CMG helicase activity upwards of 300-fold, as measured using ATPase and fork-unwinding assays in vitro [10]. The CMG moves along single-stranded DNA (ssDNA) in a 3’-5’ direction (Fig. 1) using ATPase-driven allosteric changes to MCMs as they interact with incoming bases on the ssDNA in the central MCM channel [10, 19, 30, 36, 125]. The ssDNA passing through the CMG central channel is copied by the leading strand polymerase, DNA Polymerase-ε [125]. The lagging strand of DNA is created at the front of the CMG, where the combined efforts of amino-terminal MCM domains and Cdc45 and GINS separate the double-stranded parental DNA [15, 36, 125, 126]. Ctf4 mediates interactions of the CMG with the lagging strand polymerase, initially DNA Polymerase-α-Primase, then transitioning to DNA Polymerase-δ, but Ctf4 also displays some enhancements to CMG processivity on its own in specific in vitro assays [18, 36, 127]. The CMG itself appears to play an important role in managing the distribution of DNA polymerases between leading and lagging strands, as mutations in GINS reduce recruitment of DNA Polymerase-ε to the leading strand and necessitate use of DNA Polymerase-δ instead [128]. Also interacting with CMGs and facilitating CMG and replisome function during DNA replication is a protein called Mcm10, which stabilizes replication forks and manages replicative stresses [24, 129,130,131,132,133,134,135,136]. 
In some cancers the Mcm10 locus is amplified and Mcm10 protein overexpression is evident, suggesting Mcm10 may play a role in promoting cancer growth through CMG interactions and changes to CMG or replisome function [129]. During CMG unwinding at forks, Cdc45 also recruits histone-modifying enzymes, including Cdk2, to facilitate replisome movement through higher-order chromatin that is modulated by histone-H1-dependent interactions and compaction [137].

An important point to understand from this discussion is that the MCM hexamer pairs that are converted to double CMG helicases and participate in replisomes represent the starting sites of DNA replication in mammalian cells. These MCMs/CMGs are thus the actual origins of DNA replication (analogous to ori’s) even though specific DNA replicator sequences generally do not define such starting sites in mammalian cells.

Reserve MCM “dormant origins”: keys to replication fork management, responses to replicative stress, and genomic stability

A region of a chromosome that is copied bidirectionally from an origin of DNA replication, now defined as the location where a CMG pair becomes activated, is called a replicon. Perhaps paradoxically, mammalian (and other eukaryotic) cells load significantly more MCM hexamers onto chromatin/DNA, on average, for each replicon than are needed to be converted to a single pair of functioning CMG helicases [40, 41, 52, 138]. Many of these extra MCM hexamers that are loaded are derived from nascently-synthesized MCMs prior to the separation of daughter cells during mitosis, and have not functioned in DNA replication prior to their chromatin accumulation in the next G1 and S-phase [139]. These excessive MCM hexamers that are loaded serve at least three purposes. First, MCM hexamers distributed throughout the chromosomes represent the aforementioned licensing step that marks unreplicated DNA and allows (and ensures) DNA replication to occur once, and only once, in each cell cycle. Second, mammalian cells are “smart” in knowing they will encounter issues that lead to replication fork stalling, also called Replicative Stress (RS). These fork-stalling issues could include heterochromatin, topological constraints, transcription interference, DNA mutation-repair events, chemotherapy, or radiation leading to DNA breaks. Importantly, these excessive MCM hexamers provide a means to recover DNA replication after fork-stalling events, leading to their classification as reserve MCMs, or dormant origins [38,39,40,41]. Finally, using mechanisms that remain poorly understood, the excessive MCMs modulate and restrict replisome fork speeds during S-phase to prevent DNA damage and genomic instability [139].

Mammalian cells load 5–10× the number of MCM hexamers required to finish an unperturbed S-phase [138]. Estimates derived from quantitative immunoblotting techniques have suggested that for each 100 kb of mammalian DNA, on average, there are ~4–6 MCM hexamers loaded, meaning ~2–3 MCM hexamer pairs (maximal estimates) [138]. Replicons in mammals range in size from perhaps 20 kb to as large as several hundred kilobases [140, 141]. Thus, a larger 300-kb replicon is predicted to contain as many as ~6–9 MCM hexamer pairs even though only one pair of MCM hexamers is required as an origin within the replicon [138]. Regardless of precise replicon size and MCM loading distribution within replicons, which is likely stochastic in nature in different parts of the genome, it is clear that far more MCM hexamers are loaded than are required. Reduction of MCM levels by 80–90% using siRNA-mediated methods, which does not alone hinder cell growth but depletes the reserve MCMs, results in loss of cell viability and signs of increased DNA damage when RS is induced by fork-stalling drugs [38,39,40,41]. The reserve MCMs are needed to function as CMG helicases to resume DNA replication and recover or complete S-phase, thus maintaining genomic stability after RS and DNA damage [38]. Under unperturbed S-phase conditions or in the absence of RS (i.e., no drugs), reserve MCMs do not act as origins of DNA replication (serving only as licensing marks and fork speed regulators/suppressors) and are thus dormant origins [38, 40, 41]. Evidence indicates that the ATR and ATM proteins mediate suppression of these dormant origins, regulating origin usage until stalled fork stress or DNA damage is encountered [142,143,144,145,146].
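
The replicon arithmetic in this paragraph can be reproduced with a short calculation. The following is a minimal sketch that takes the ~4–6 hexamers per 100 kb estimate quoted from [138] and assumes, purely for illustration, that loading density scales linearly with replicon length; the function name and the linear-scaling assumption are ours and are not part of the cited work.

```python
def mcm_pairs_in_replicon(replicon_kb, hexamers_per_100kb=(4, 6)):
    """Return (low, high) estimates of MCM hexamer pairs in a replicon.

    Assumes loading density scales linearly with replicon length, an
    illustrative simplification (actual loading is likely stochastic).
    """
    low, high = hexamers_per_100kb
    scale = replicon_kb / 100
    # Two head-to-head hexamers make up one MCM pair.
    return (low * scale / 2, high * scale / 2)

low, high = mcm_pairs_in_replicon(300)
print(f"300-kb replicon: ~{low:.0f}-{high:.0f} MCM hexamer pairs, "
      "of which only one pair is needed as an origin")
```

Under these assumptions, every pair beyond the first in a replicon counts as a reserve (dormant) origin.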

Although reserve MCMs are available to function as CMG helicases under replicative stress conditions, it is currently unknown whether reserve MCMs are simply hexamers on DNA lacking Cdc45 and GINS, or instead are fully formed CMGs containing Cdc45/GINS that are inactive until an RS event occurs. MCMs are loaded onto chromatin in excess (~4–6 MCM hexamers/100 kb), but the total protein abundance of Cdc45 in mammalian cells (~0.35 molecules total/100 kb, 25,000–30,000 molecules/cell) is extremely low [138]. In addition, mammalian Cdc45 is rate limiting for CMG formation [138]. To our knowledge, GINS levels have not yet been determined. For these reasons, it is more likely that reserve MCM hexamers do not contain Cdc45/GINS due to low stoichiometric levels of such proteins (minimally Cdc45), and MCM-to-CMG conversion occurs only when RS requires activation of reserve helicases. Regardless of the timing of Cdc45/GINS recruitment, going forward, we will refer to reserve helicases as “reserve MCMs” or “reserve CMGs”, as the distinction is not necessarily relevant to further understanding of the role of reserve CMG helicases in cancer.
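
The stoichiometric argument can be made explicit with the figures quoted above. The sketch below assumes one Cdc45 molecule per CMG (consistent with the 11-subunit composition described earlier) and, as a deliberately generous upper bound, that every Cdc45 molecule in the cell is bound to a loaded MCM hexamer; the function name is hypothetical.

```python
# Figures quoted in the text from [138].
MCM_HEXAMERS_PER_100KB = (4, 6)   # loaded MCM hexamers per 100 kb
CDC45_PER_100KB = 0.35            # total Cdc45 molecules per 100 kb

def max_cmg_fraction(cdc45_per_100kb, hexamers_per_100kb):
    """Upper bound on the fraction of loaded hexamers that could be CMGs.

    Assumes one Cdc45 per CMG and that all Cdc45 is chromatin-bound
    (an illustrative best case, not a measured value).
    """
    return cdc45_per_100kb / hexamers_per_100kb

for n_hexamers in MCM_HEXAMERS_PER_100KB:
    fraction = max_cmg_fraction(CDC45_PER_100KB, n_hexamers)
    print(f"{n_hexamers} hexamers/100 kb: at most {fraction:.3f} "
          "of loaded hexamers could contain Cdc45")
```

Even in this best case, fewer than one in ten loaded hexamers could carry Cdc45, which is why it seems more likely that reserve MCMs lack Cdc45/GINS until they are needed.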

The extra MCMs loaded onto mammalian chromatin are not just reserves for recovering from RS conditions. Evidence shows that DNA replication fork/replisome speeds are suppressed or modulated by the extra MCMs [139]. Situations that reduce the presence of the extra MCMs do not always hinder the response to RS, but do result in replication fork speeds being increased 20–25% relative to the normal condition in which all extra MCMs are sufficiently loaded prior to S-phase [139]. Although the biochemical mechanisms underlying this ability of reserve MCMs to regulate fork speeds remain unknown at this time, it has been suggested that the extra MCMs may serve as a sort of roadblock, or rate-limiter, for forks established by other MCM/CMGs, sterically limiting replication forks when reserve MCMs are encountered [139]. The steric limitations may also involve time required to disassemble and remove the dormant extra MCMs and unlicense that region of the chromosome [139]. However, other mechanisms involving molecular signaling between MCMs and CMGs (directly or using soluble factors) cannot be ruled out. Loss of control of replication fork rates with reduced MCM availability results in the presence of asymmetrical replisomes and DNA damage, leading to genomic instability [139]. Intriguingly, a reduction of the normal levels of extra MCMs is predicted to create a genome-destabilizing compounding effect: increased fork speeds and DNA damage during S-phase that then requires the presence of reserve MCMs/CMGs to facilitate recovery from the RS and DNA damage induced by the deregulated fork speeds. Application of this concept to tumorigenesis is described below.

A model emerges in which replicons load a large number of MCM hexamers (around the R-point), but only a few are chosen in late-G1 to be converted to CMG helicases that eventually become activated at G1-S to replicate DNA during S-phase (Fig. 2, MCM pairs in orange, CMGs in color). The reserve complement of extra MCMs “waits” for replicative stress to signal their conversion to CMGs for recovery, and also regulates replication fork speed to prevent DNA damage [139]. Intriguingly, this aligns nicely with how DNA replication origins in mammalian cells have been shown to function. Origins are not rare specific DNA sites defined by replicators, but instead are zones of potential replication start sites [61, 62, 147,148,149,150,151,152], or many potential origins where one origin is chosen. Experiments have shown that MCM proteins are indeed loaded into mammalian origin zones in a distributed manner during G1 phase [153], suggesting that ORC loads excessive MCM pairs in a stochastic manner within these zones. Based on the modeling from the above discussion, it appears then that any pair of these loaded MCMs can be converted to a pair of CMG helicases. What specifies a particular MCM pair to be converted to a CMG pair is not known, but it has been shown that parental MCMs that functioned in some manner in the previous S-phase prior to daughter cell creation at mitosis are preferred in the next S-phase for conversion into CMGs [139]. Regardless of the mechanism, the CMG pair activated then defines the origin in that replicon of that cell. The stochastic nature of this process occurring in a population of cells would be seen experimentally as a zone of potential origins, derived from a zone of MCM paired-hexamers, with different MCM pairs randomly chosen in different cells to become CMG helicases within the zone.
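
The population-level consequence of this model can be sketched with a toy simulation. It assumes that exactly one MCM pair per replicon is converted in each cell and that every loaded pair is equally likely to be chosen; the latter is a simplification, since parental MCMs appear to be preferred [139]. The zone size and cell number are arbitrary illustrative values.

```python
import random
from collections import Counter

def simulate_origin_zone(n_cells, n_mcm_pairs, seed=0):
    """Count how often each loaded MCM pair becomes the active CMG pair.

    Assumption (illustrative only): one pair fires per cell, chosen
    uniformly at random from the pairs loaded in the zone.
    """
    rng = random.Random(seed)
    choices = (rng.randrange(n_mcm_pairs) for _ in range(n_cells))
    return Counter(choices)

counts = simulate_origin_zone(n_cells=10_000, n_mcm_pairs=6)
for position in sorted(counts):
    print(f"MCM pair {position}: used as the origin in {counts[position]} cells")
```

Each simulated cell has a single, well-defined origin, yet the population as a whole shows initiation events spread across the zone, matching the experimental picture of an initiation zone rather than a fixed replicator site.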

Fig. 2: Mammalian cells load excessive MCM hexamers onto DNA for multiple purposes.

Mammalian cells load an excess of MCM pairs (shown in orange as hexamer pairs) onto DNA, with a small number of select MCM pairs being converted into replicative CMG helicases (near G1/S transition). The extra MCMs serve at least three purposes (shown with red text/arrows). The first role is to mark unreplicated DNA, thus licensing DNA for one round of DNA replication. Second, the extra MCMs regulate replication fork speeds of those MCMs that are converted into CMG helicases during S-phase, maintaining a fork rate that prevents DNA damage. Third, following replicative stress (RS), such as fork-stalling or DNA damage, excess reserve MCM pairs can be converted into active CMG helicases for recovery of DNA replication, likely after DNA repair. See text for details and references.

Why would mammalian cells devise such a plastic and non-specific system for initiating DNA replication from random sites/origins, versus from specific replicators that likely require use in every cell cycle? Indeed, although the extra MCMs are clearly involved in licensing, fork speed regulation, and recovery from replicative stresses, mammalian cells could have evolved these genome-stabilizing mechanisms using other factors or methods (other than relying on MCMs), while demanding replication to begin at specific non-redundant sites. One possibility is that this plasticity allows DNA replication to start and complete from any and all regions of the mammalian genome regardless of the transcriptome that is present. Active transcription in a given region can be an opposing force to DNA replication in mammalian cells, with some studies showing that MCM loading and initiation of DNA replication are reduced in actively transcribed genomic regions (i.e., initiation zones have been shown in some cases to be intergenic) [61, 153,154,155]. Perhaps this MCM/CMG plasticity accommodates differences in gene expression during development and differentiation, guaranteeing flexible starting and completion of DNA replication regardless of any limits imposed by transcription or other issues that might interfere with specific replicators in different developing cell types. However, such a model for MCM preferential loading in intergenic regions due to low transcription activity is likely an oversimplification. Active transcription can occur in intergenic regions [156, 157], and transcription itself plays an active role in stimulating DNA replication origin activity or regulating the timing of initiation [158,159,160,161,162]. In fact, differential transcriptional programs in different cell types have been shown to influence origin firing locations, which may also contribute to flexibility of DNA replication initiation (or ensuring initiation) within various transcriptomes [158]. 
Clearly, more work needs to be done to understand the molecular mechanisms controlling where and how MCMs are ultimately loaded onto DNA in mammalian cells. Regardless of such specific details, the excess MCM/CMG flexibility supplies a needed set of reserve dormant origins that can suffice for completion of DNA replication should unanticipated (or anticipated) problems occur during S-phase, while also managing replication fork speeds. To use the colloquial, the mammalian cell has devised a means to “hit multiple birds with one stone”, the stone being the excessive MCMs necessary for many tasks.

The CMG is a target of TGFß1 growth arrest and the Rb tumor suppressor

CMGs are not passive bystanders in the cell cycle simply waiting to function in a replisome when the time arises. If one defines S-phase as the start of DNA synthesis by polymerases, and late-G1 phase as the time during which double CMGs assemble (and potentially begin localized DNA melting [23]), then G1-S can be defined molecularly as the point in time when CMGs are activated to unwind DNA in a processive manner within replisomes. This would indicate that all stimulatory growth factor signals must ultimately regulate CMG function, which has been illustrated in the above discussion in terms of positive influences of such signals on CMG assembly/activation arising from the cell cycle machinery. Conversely, inhibitory growth factor signals must prevent CMG function, and this is apparent when assessing the mechanisms used by Transforming Growth Factor ß1 (TGFß1) to achieve cell cycle arrest. TGFß1 is a potent growth-suppressive factor for epithelial, endothelial, and many immune cells that inhibits multiple events in the cell cycle to block progression through the G1-S transition [77, 163,164,165]. This growth-suppressive ability of TGFß1 is often lost in human malignancies due to abrogation of the molecular signaling events that mediate TGFß1-induced cell cycle arrest [166, 167]. As such, understanding the mechanisms utilized by TGFß1 to block cell growth can identify molecular targets that are critical in promoting the growth of tumor cells and could thus serve as a focus for drug discovery efforts aimed at blocking tumor progression.

Well-known effects of TGFß1 signaling include suppression of Myc and cyclin expression, and inhibition of Cyclin E/Cdk2 and Cyclin D/Cdk4 kinase activities [77, 166,167,168,169,170,171]. TGFß1 also targets proteins that control Cdk function, suppressing Cdc25A phosphatase and inducing multiple cyclin-dependent kinase inhibitors (CKIs) such as p15INK4B, p21Cip1, and p27Kip1 [172,173,174,175,176,177]. Suppression of these kinases prevents the phosphorylation of the retinoblastoma protein (Rb) that normally allows cells to progress into S-phase [77, 124, 165, 178, 179]. Rb thus plays an important role in mediating growth arrest by TGFß1, as it is a collective downstream target of the proteins and enzymes that TGFß1 negatively regulates. However, studies have shown that Rb is not always necessary for TGFß1 to achieve cell cycle arrest, and growth inhibition by TGFß1 can occur in the absence of Myc suppression or in cells lacking the aforementioned CKIs [77, 124, 172, 180,181,182,183]. Added to this, overexpression of Myc alone can override these inhibitory events targeted by TGFß1 and promote G1-S transit, indicating that Myc is epistatic to the negative effects of TGFß1 and positively affects a factor(s) in G1 phase necessary for entry into S-phase [184,185,186]. The mechanisms mediating TGFß1 inhibition of CMG helicase assembly and activation can provide explanations for these findings.

The effects of TGFß1 on the CMG depend on the timing in G1 phase when cells are exposed to TGFß1, and the status of the Rb protein (Fig. 3). In cells expressing Rb, TGFß1 exposure in early G1 blocks MCM expression, along with suppression of Myc expression, and inhibition of Cyclin E/Cdk2 activity later in G1 [77, 124, 165,166,167,168, 170, 171, 178]. The absence of MCM expression will block progress through G1, and lack of Myc and Cyclin E/Cdk2 activity are guarantors of no G1-S transit [77, 124, 165, 168, 170, 178]. However, when Rb-containing cells reach late-G1 and are exposed to TGFß1, these events in early G1 have passed and CMGs have assembled (or are assembling). In late-G1, TGFß1 relies on Rb to bind and inhibit CMGs from becoming active at G1-S, and the more phosphorylated form of Rb can do this [124].

Fig. 3: Transforming growth factor-ß1 cell cycle arrest signaling pathways target multiple aspects of MCM assembly and CMG activation.

Many of the conventional targets for TGFß1-induced growth arrest, including Myc and Cyclin E-Cdk2 (and Cdk-inhibitors, CKIs), regulate steps involved in MCM/CMG functionality. TGFß1 targets and suppresses Myc, Cyclin E-Cdk2, and MCM expression and abrogates CMG assembly/activation, thereby using a pleiotropic set of inhibitory signals on the CMG to ensure a block to cell cycle progression. TGFß1 also targets MCM hexamer assembly in middle G1, or CMG activation at G1-S, depending on the Rb status of the cells (lower right).

Curiously, work performed by our group in two related studies has shown that in certain mammalian cells lacking Rb (mouse keratinocytes, MK) these TGFß1-induced mechanisms controlling the CMG are upended. Myc, Cyclin E/Cdk2 kinase, and MCM expression are no longer inhibited by TGFß1 exposure in early G1 when Rb is missing, and in late-G1, without Rb, TGFß1 cannot inhibit any CMGs that do form [77, 124]. Nonetheless, despite these severely debilitating effects on TGFß1’s ability to block cell growth in the absence of Rb, TGFß1 continues to inhibit the cell cycle, albeit only when added to cells that are in early G1 [77, 124]. While future work by other groups may expand upon these findings with additional insight, one mechanism identified involves an ability of TGFß1 to inhibit MCM hexamer assembly/loading, derived at least in part from the suppression of a Cyclin E-Mcm7 interaction and reduced association of active Cyclin E/Cdk2 kinase with chromatin [77]. Overexpression of Cdt1 can override these particular TGFß1 arrest mechanisms in Rb-lacking MK cells, restoring MCM assembly and G1-S transit [77]. This would suggest that another oncogenic role for Cdt1 may include abrogation of growth-inhibitory TGFß1 signals in cells lacking Rb.

The mechanisms by which Rb binds and inhibits the CMG provide answers to some questions regarding how Rb regulates G1-S transit and plays an important role in tumorigenesis. The combined work from our group and others has found that Rb uses a bi-partite mechanism to block the start of DNA replication: inhibition of the CMG and abrogation of Ctf4 and DNA polymerase interactions with the CMG/replisome [32, 187, 188]. These functions of Rb derive from specific exons located in the amino-terminal half of Rb (RbN). A domain in RbN comprised of exon5/6 (called the Projection) interferes with Ctf4/Polα recruitment to replisomes/CMGs, while exon7 (Ex7) inhibits the CMG [187, 189]. These exons are often lost in familial inherited retinoblastomas that are lower penetrance diseases, compared to higher penetrance retinoblastomas derived from mutations or loss of the carboxy-terminus of Rb where the E2F proteins are known to interact [187, 189,190,191,192]. Importantly, both of these RbN exon domains are additive and independent in their ability to control DNA replication, but are only individually lost in a particular inherited cancer lineage [187, 189, 190]. Thus, partial penetrance can be explained by an inherited Rb allele with reduced inhibitory functions toward either the CMG or Ctf4/Polα, but not both, depending on the specific exon that is lost from the RbN domain [187]. Notably, there are other proteins that interact with RbN [189, 193], so the mechanisms mediating partial penetrance are likely more complex than simply claiming the CMG and Ctf4/Polα as explanations.

RbN has been shown to provide a necessary function for inhibiting G1-S transit, independent of the carboxy-terminus of Rb [187, 194, 195]. The control over Ctf4/Polα and the CMG using specific exon domains of RbN discussed above provides a molecular explanation for this function [32, 187]. RbN binds to the CMG through direct interactions with Mcm7, which are diminished when Ex7 is missing from RbN [32, 124, 187, 188]. Full-length Rb can also bind to the CMG, and this is independent of phosphorylation status of the carboxy-terminal region of Rb [124, 187, 188]. As such, the CMG remains a direct target of Rb in late-G1, after the Restriction Point when Rb becomes phosphorylated [98, 124, 187, 196]. This explains why Rb can mediate TGFß1 inhibitory signals toward the CMG in late-G1, and why RbN is required to mediate control over G1-S transit [124, 187, 194, 195]. Importantly, it is not known how Rb/RbN biochemically controls the CMG via Mcm7 interactions [187]. Does Rb inhibit ATP hydrolysis or fork-unwinding activities of the CMG? Or does Rb interfere with an unknown partner for the CMG? Future studies may uncover answers to these questions. Besides regulating G1-S transit, Rb is also important for mediating ongoing DNA replication arrest in cells exposed to ionizing radiation or fork-stalling drugs [197,198,199]. The absence of Rb leads to errant and excessive DNA replication, creating a form of genomic instability called hyperploidy [197]. Though not formally shown, a plausible target of Rb in suppressing DNA replication and hyperploidy under such conditions may be the CMG helicase.

Reserve MCM/CMGs resemble tumor suppressors in experimental models

CMG helicases are quite logically involved in promoting cell cycle progression and DNA replication, responding to positive growth factor pathways and the cell cycle, and being targeted for inhibition by negative growth factors such as TGFß1. At first glance, this could suggest the possibility that CMGs (and MCMs) might be found in some human malignancies to function as growth drivers, analogous to oncogenic growth drivers. However, there is an absence of genetic evidence in human cancers showing that components of the CMG (or the CMG helicase on the whole) are oncogenic drivers of tumorigenesis, mutated or amplified with gain-of-function outcomes as occurs with known oncogenic drivers (see discussion below). On the contrary, in certain experimental models MCM/CMGs actually display qualities of tumor suppressors, and this almost certainly derives from mismanagement of the extra reserve MCM/CMG complement in cells. Although there are currently no genetic demonstrations that MCM/CMGs are, in fact, tumor suppressors in any particular human malignancy, genetic studies in mice suggest that intact wildtype MCM proteins function to suppress tumor initiation [200,201,202,203].

In one study, mice carrying a single engineered mutation in the Mcm4 gene referred to as the Mcm4Chaos3 allele, but no other genetic changes in oncogenes or tumor suppressor loci, are tumor prone [203]. Tumors that arise include primarily breast adenocarcinomas, but also some lymphomas or histiocytic sarcomas [203]. Why this limited set of tumors arises when Mcm4 is mutated, which could affect many cell types in the animals, is a curiosity that remains unexplained. Fibroblasts from Mcm4Chaos3 mice have increased DNA damage, stalled replication forks, and activated fork recovery events even though such cells were not subjected to replicative stress from outside influences such as fork-stalling drugs [202, 203]. The unperturbed S-phases in Mcm4Chaos3 fibroblasts are stressed for DNA replication processes, and biochemically this is derived from the presence of weakened MCM hexamers [202]. The mutant CMGs created from these Mcm4Chaos3 MCM hexamers are enzymatically unhindered when tested in vitro in assays examining fork-unwinding ability [202]. However, the mutation in Mcm4 results in an apparently destabilized Mcm4 protein and an associated reduction in Mcm7 protein level [203]. This leads to a lowered chromatin binding capability by the entire complement of MCM hexamers, yielding functionally weakened MCM hexamers, including the extra reserve MCMs [202]. Based on our understanding of the roles the extra reserve MCMs play in cells, weakened MCM reserves render such cells less capable of responding to any normal fork-stalling events that occur during DNA replication, but likely also contribute to changes in replication fork speeds that yield faster, lower fidelity replisome movement [139]. Together, these events lead to DNA damage and increased replicative stress, destabilizing the genome [139]. 
Intriguingly, the replicative stress (RS) induced in these unperturbed fibroblasts from the Mcm4Chaos3 mice appears to be tolerated by the cells, being low enough to evade cell cycle checkpoint arrest [202]. This ongoing RS causes problems for chromosome segregation and allows acquisition of genomic deficiencies, likely driving the tumorigenesis that is seen [202]. These results indicate that failure to developmentally maintain a proper, healthy complement of MCMs, particularly MCM reserves, leads to DNA stress over time that can promote genomic instability and evolutionarily drive tumorigenesis.

A similar situation exists in mice with reduced expression of Mcm2 protein. Using genetic modifications to the Mcm2 locus, investigators found that reducing Mcm2 expression to approximately one-third of normal levels, which would diminish the extra reserve MCMs (and did co-reduce Mcm7 protein), also results in early onset cancer development [200, 201]. Tumors that appear include B-cell and T-cell lymphomas, thymomas, liver cancers, and lung cancer, and polyps were present at times in the intestine and colon [200, 201]. While there is some overlap of lymphoma development between these Mcm2-deficient mice and the Mcm4Chaos3 mice, it is again curious that only a limited number of tumor types arise. The authors noted that the genetic background of the Mcm2-deficient mice influences tumor outcomes [200, 201], suggesting that tumor development due to MCM mutations involves additional unknown genetic conditions, perhaps in certain cell types, to promote specific tumor formation in these mouse models. Under conditions of replicative stress, fibroblasts from mice with diminished Mcm2 display low levels of replication origin usage [200]. Under normal conditions, a slight elevation in DNA damage response indicators was present but the cells appeared to tolerate it [200], similar to the Mcm4Chaos3 situation [202]. Although not shown experimentally, the small increase in damage could be due to reduced origin usage in the context of lowered Mcm2, leading to incomplete DNA replication. When mated to mice lacking p53 expression, a significant reduction in viable offspring is seen, since the presence of low levels of Mcm2 concurrent with an absence of p53 reduces viability of cells during embryogenesis [200]. However, in the small number of mice that are born a more rapid onset of tumorigenesis is seen. One reason for this appears to be an increase in genomic damage at the cellular level in mice with both reduced Mcm2 and p53 loss compared to mice with only Mcm2 deficiencies [200].
Reduced MCM hexamers produce an environment that renders cells less capable of controlling replication fork speeds and recovering from resultant DNA damage [139]. Loss of p53 likely intensifies the genomic instability that ensues, removing an important tumor-suppressive DNA damage sensor, thereby leading to a synthetically lethal condition in many embryonic cells or allowing increased DNA damage to remain in cells that survive [200].

The results from these mouse studies demonstrate that small changes in MCM reserves (dormant origins, and extra MCMs that regulate fork speeds) lead to increased genomic instability and tumorigenic outcomes. Importantly, the effects of mutations or reduced functioning in MCMs resemble those of p53 or Rb mutations/loss in tumorigenesis, where genome stability is reduced in the tumor evolutionary process due to loss of tumor suppressor function. Thus, cells must maintain proper reserve MCM functionality, which is tumor-suppressive, while mismanagement of reserve MCMs is a tumor-driving situation.

Oncogene overexpression mismanages CMGs and creates replicative stress

Certain oncogenes have been found to mismanage MCM/CMG assembly and/or activation, suggesting a mechanism by which they can drive tumorigenesis through debilitation of the reserve complement of MCM/CMGs [3]. Although Myc is conventionally thought of as a transcriptional regulator, Myc also has an important non-transcriptional role in regulating activity of CMG helicases [4,5,6, 204]. Myc regulates the assembly of CMGs by promoting the recruitment of Cdc45 and GINS to MCM hexamers [5, 6, 204]. Myc achieves this in part by recruiting two histone acetyltransferases, GCN5 and Tip60, to Myc-bound chromatin sites, thereby leading to decondensation of higher-order chromatin and creation of access for Cdc45 and GINS to bind to resident MCM hexamers [204]. A specific domain of Myc, Myc-Box II, is required for this stimulation of CMG assembly and activation [204]. Myc promotion of CMG function also involves regulation of necessary kinase activity [5]. Intriguingly, there are hints that Myc may stimulate CMG activity in a more direct manner, since Myc is found in complexes with Cdc45 when CMGs are stimulated and Myc can interact with the Mcm7 protein [204, 205]. However, a direct role for Myc in CMG binding and stimulation awaits further investigation. These mechanisms underlying how Myc regulates CMG assembly, together with the many other roles for Myc in promoting transcription and other cell cycle events, collectively explain why Myc can override TGFß1 growth arrest [184] (Fig. 3) and promote S-phase entry. Intriguingly, Myc cannot override TGFß1 if Myc is overexpressed specifically in late-G1 [184], perhaps because Myc will have no MCM hexamers or CMG components to stimulate if TGFß1 has blocked MCM/CMG presence earlier in G1 [77].

Overexpression of Myc leads to over-activation of CMG helicases (Fig. 4) [3, 5, 6, 204]. These over-activated CMGs decrease inter-origin distances and are necessarily derived from the reserve pool of dormant reserve MCMs being converted to CMGs [5]. Such effects of Myc overexpression on the CMGs are a problem for cells, as excessive Myc is known to induce replicative stress in an acute manner, far too quickly to be explained by changes in transcription profiles such as metabolic gene stimulation [5, 6, 206]. Myc overexpression produces acute onset of genomic instability, fork rate slowing, fork asymmetry, and DNA damage [3, 5, 6, 206], with such outcomes aligning with observations seen when extra MCM reserves are reduced [139]. The DNA damage induced by Myc strictly requires CMG over-activation, and experimental approaches that promote excessive Cdc45 or GINS recruitment to MCMs can phenocopy these effects of Myc in creating acute DNA damage responses [5]. Thus, one mechanism by which overexpressed Myc can drive tumorigenesis is through excessive CMG stimulation and consequent mismanagement of the reserve complement of extra MCM/CMGs, which creates DNA damage. Another mechanism derives from the Myc-induced increase in replication fork density and resultant structural problems within certain segments of the genome [5]. Myc-induced fork density changes and reduction of MCM reserves likely compound this situation to reduce genome stability. Ironically, deregulation of the CMGs (and MCM reserves) by Myc may create the initial replicative stress in terms of altered DNA replication fidelity and fork stress, and to recover from such RS cells need the full complement of reserve MCM/CMGs that Myc has perturbed. Thus, elevated Myc would appear to cause and exacerbate genomic instability. 
This ability of Myc, a cancer-driver overexpressed in the majority of human malignancies, to create MCM reserve deficiencies as a likely secondary driver for cancer development is similar in concept to reduced MCM functionality driving cancer in the Mcm mutant mouse models.

Fig. 4: Oncogenes such as Myc and Cyclin E mismanage the reserve MCM/CMGs.

Oncoproteins such as Cyclin E or Myc, when overexpressed, cause problems in the dynamics of MCM loading and usage. Cyclin E elevation suppresses MCM loading (black squares), while Myc elevation leads to excessive stimulation of CMG helicases (black arrows). Altering MCM/CMG levels adversely affects replication fork speeds and origin usage, and reduces MCM/CMG reserves. These oncogene-driven effects on the extra reserve MCM/CMGs generate replicative stress (RS) and DNA damage, and create a CMG vulnerability wherein tumor cells have difficulty responding to the RS, or to additional fork-stalling or fork-destabilizing stresses.

Overexpression of Cyclin E also causes mismanagement of MCM reserves [3, 7]. Elevated Cyclin E is associated with acute onset of genomic instability, and promotes abnormal origin firing, collapsed replication forks, and double-strand DNA breaks [207,208,209], again aligning with that seen when the extra reserve MCMs are reduced [139]. Cyclin E deregulation also appears to contribute to chromosomal rearrangements and genome duplication that are present in cancers [207, 209]. At the mechanistic level, these events likely derive in part from an ability of Cyclin E overexpression to cause a reduction in the number of MCM hexamers that are loaded onto DNA (Fig. 4) [7]. This necessarily depletes the reserve pool of MCM/CMGs that are needed for responding to RS during S-phase and limiting DNA replication fork speeds to prevent DNA damage [139]. As discussed above, Cyclin E-Cdk2 plays complex roles in MCM assembly, appearing to promote MCM subunit chromatin binding, but limit MCM hexamer assembly when deregulated [7, 84, 85]. Deregulated Cyclin E also causes premature entry into S-phase, which may contribute to the MCM deficiency by promoting S-phase entry before enough MCMs have loaded [209]. Thus, similar to the situation for Myc, elevated Cyclin E creates RS due to MCM reduction and other mechanisms that is then difficult to recover from due to a reduction of dormant reserve MCM origins.

Notably, Myc or Cyclin E may not affect all MCM (reserve) hexamers in all parts of the genome evenly. Depending on the timing and degree of Myc or Cyclin E overexpression, or the physical locations of their DNA interactions, certain regions of the genome may be more susceptible to altered MCM/CMG management by these oncogenes. In addition, the transcriptome of a particular cell may influence chromatin states and locations where oncogenes more adversely affect MCM/CMG loading levels. It is currently unclear whether mammalian MCMs are loaded onto chromatin in an evenly-distributed manner or asymmetrically throughout the mammalian genome, or whether certain chromosomal regions contain more or fewer local MCMs relative to other domains. However, in yeast, it has been shown that certain regions of the genome have enriched levels of dormant MCMs loaded onto chromatin [210]. As such, one could speculate that if regions of the mammalian genome exist with a lower density of local MCM reserves, then oncogene mismanagement of MCMs in such regions could yield more RS and DNA damage relative to regions that have higher MCM densities. In sum, oncoproteins such as Myc or Cyclin E may produce tumor heterogeneity at the genomic level as a result of stochastic MCM mismanagement and DNA damage that drives evolutionary selection of tumor cells with certain growth advantages.

Related to the above oncogene-induced effects, MCM mismanagement and genomic instability can also be induced by whole genome duplication events, independent of oncogene activation [211]. Cells that are induced to become tetraploid display elevated DNA damage during DNA replication in the first S-phase following a genome duplication [211]. Regions of the tetraploid genomes are under-replicated, while other regions are over-replicated. At the mechanistic level this appears to be due to a diminished level of necessary DNA replication factors, including MCM and Cdc45 components [211]. Intriguingly, a logical interpretation of these results is that MCM/CMG reserves are mismanaged in the tetraploid cells, being reduced stoichiometrically relative to the increased DNA content. Once MCM/Cdc45 levels (and other factors) were increased to accommodate the extra genomic material, the cells displayed less genomic instability [211]. Such findings are consistent with a need to tightly regulate the levels of MCM/CMG complexes in a cell to maintain genomic stability.

Human cancers and the “missing mutations” in MCM/CMGs

Given that mismanagement of reserve MCM/CMGs can (experimentally) promote tumorigenesis, and that oncogenic pathways can cause such MCM/CMG mismanagement and DNA damage, one might predict that at the genetic or protein levels the CMG itself or some of its subunits will be mutated, overexpressed, or under-expressed in human cancers, with such CMG genetic alterations resulting in cancer-driving conditions or loss of tumor-suppressive function, similar to that observed for well-established cancer-drivers or tumor suppressors (e.g., Myc, Ras, p53, or Rb genes). However, to date, genetic evidence is lacking that can demonstrate CMG components are oncogenic cancer-drivers or tumor suppressors in human cancers.

When examining many human tumor tissue samples by immunohistochemical (IHC) methods one typically sees elevated expression of all CMG subunits analyzed, often referred to as “overexpression” of CMG subunits that might suggest a cancer-driving or cancer-promoting situation. For a thorough review and collation of such tumor analysis findings the reader is referred to a comprehensive discussion [212]. Analysis of elevated CMG subunit expression in human tumor tissue has also been used as a novel biomarker to identify malignant and pre-malignant tissue and as a predictor of survival, with higher MCM, Cdc45, or GINS levels, for example, often associated with worse outcomes [25, 26, 213,214,215]. However, elevated tumor tissue expression of CMG components in histopathological samples does not provide conclusive evidence that, on a per cell basis, the CMG subunits (all 11) are actually overexpressed stoichiometrically relative to the number of CMG subunits expressed in normal (non-tumor) proliferating cells. There is evidence that in some established cancer cell lines several of the CMG components may be overexpressed, sometimes based on mRNA levels [25, 27, 28, 212]. However, it has also been shown in a quantitative study that proliferating tumor lines and non-tumor proliferating cells contain roughly equal protein numbers of the CMG subunits [138]. Quite clearly, elevated IHC staining for CMG subunits indicates that CMGs are more visible in proliferating tumor tissue relative to that seen in non-tumor, differentiated, and non-proliferating (or lower-proliferating) neighboring tissue. However, elevated IHC staining observations will be evident for many proteins involved in cell cycle progression because tumor tissue is in a proliferative state. Importantly, higher IHC staining or elevated expression in a tumor cell line does not demonstrate that such “overexpressed” proteins are drivers of the cancers.

Information from large-scale genomic sequencing efforts on human tumor samples indicates that the CMG enzyme is a “never-mutated” protein complex in tumor cells, from the perspective of failing to find demonstrable CMG cancer-driving or inactivating mutations. A search of more than 68,000 human tumor samples analyzed in 205 studies in the publicly available database cBioPortal (www.cbioportal.org; accessed October 23, 2022) [216, 217] finds that, while the loci coding for the 11 CMG subunits are mutated or amplified to a small degree in some cancers (~1–3% for each CMG subunit in this analysis), a refined search for each of the CMG subunit loci finds that known cancer-driving or inactivating alterations or hot-spot mutation sites are not evident (Fig. 5A, B). This is in contrast to known cancer-driving mutations/amplifications in Myc, Ras, or other oncogenic loci that are highly evident across human cancers, as are loss-of-function or inactivating mutations in the loci for tumor suppressors such as p53 and Rb. Human cancers have elevated DNA damage and mutation rates, which predicts that even the 11 CMG subunits will be subject to stochastic genetic changes in at least some human cancers, consistent with the data in the cBioPortal collection. However, using Mcm7 coding region mutations/alterations as an example (Fig. 5C), the mutation profile residing in human cancers for Mcm7 is indicative of randomly distributed passive mutations throughout the coding region, and lacking potentially interesting hot-spot driver alterations. The cBioPortal dataset shows that other CMG subunits have similar mutation profiles to Mcm7, and not unlike that also seen for ß-Actin, which is arguably not a cancer-driver.

Fig. 5: Human cancers do not contain known cancer-driver mutations in CMG components.

A Search results from the publicly available cBioPortal database of genetic analyses on 68,088 tumor samples from 64,959 patients in 205 independent studies (performed October 23, 2022) for mutations, amplifications, deletions or other alterations in the 11 CMG helicase components. The small percentage of tumor samples in which each CMG subunit is altered is indicated on the left (asterisks), with alterations grouped by type. CMG components are altered in 4949 (7%) of all tumor samples. Study origins used for tumor analyses are indicated at the top by color coding (see cBioPortal site for publicly accessible information and details of study origins). GINS1-3 are Psf1-3, respectively; GINS4 is Sld5. B Refined cBioPortal search of the same 68,088 human tumor sample cohort after removing mutations, variants, and copy number changes of unknown significance (n = 7280), using the site's filter option, which is based on current understanding of such genetic changes. C Mutation profile for Mcm7 obtained from cBioPortal search of 68,088 human tumor samples (the majority are missense mutations). Mcm7 N-terminal and ATPase domains with homology to other MCM subunits are shown in green and red, respectively. The number of times a given change in Mcm7 is found in sampled human cancers is indicated by the Y-axis. Note that, unlike the profiles seen for many oncogenes or tumor suppressors, hot-spot mutation sites that might indicate cancer-promoting changes are not evident for Mcm7, and the profile is instead that of randomly distributed passive mutations across human cancer samples.

Importantly, while current research information suggests that there are no CMG mutations in human cancers known to be capable of driving tumorigenesis, future studies might uncover such a situation (e.g., an Mcm cancer-driving mutant) in at least some human malignancies. In addition, it is possible that deregulation of individual CMG subunits might affect cell growth for reasons independent of CMG enzyme control itself. For example, elevated Mcm7 expression has been seen in prostate cancers, and can promote aggressive characteristics of prostatic tumors [218]. Although unclear, one mechanism underlying this could be derived from the ability of Mcm7 to bind Rb and interfere with Rb signaling pathways [32, 124, 187, 188]. However, it has also been found that overexpression of Mcm3, Mcm7, or Cdc45 as single subunits is not easily tolerated by mammalian cells and can elicit a checkpoint response with failure to enter S-phase [84, 85, 138, 219], suggesting that the stoichiometry of the 11 CMG subunits is important and necessary for cell cycle progression and cell viability. Although CMG (and MCM) deficiencies can cause tumorigenesis, as shown in genetic-based animal studies, and CMGs/MCMs can be mismanaged by oncogenes to create replicative stress that promotes tumor formation, the paradox currently exists that the CMG helicase is simply not mutated in human cancers in a cancer-driving or function-inactivating manner similar to other known oncogenic drivers or tumor-suppressors. To use Abraham Wald’s analogy to the missing bullet holes [1], the CMG is “missing mutations”.

Why would this paradox exist for the CMG? The tumors analyzed by scientists are the final products of an evolutionary selection process for survival. It is quite possible that early in the tumorigenic conversion process CMG mutations might occur, or even later in cancer development due to elevated mutation rates in cancers. Importantly, any such mutations in CMG subunits would affect the entire complement of CMG helicases (and MCMs) throughout the genome. However, this may not be compatible with survival if other necessary oncogenic or tumor-suppressive genetic changes occur in the same tumor cell. For example, most human tumors display genetic changes to p53, leading to its loss or diminished function. However, loss of p53 function in the presence of reduced reserve MCMs, which promotes tumorigenesis in the small number of mice that are born, also creates a synthetically lethal situation that significantly reduces viability due to increased genomic DNA damage [200]. When a p53 mutant tumor cell also acquires Myc overexpression, for example, to drive changes in CMG reserves (and other Myc-driven changes to transcription), then survival of the tumor cell may be seriously compromised if the entire complement of MCM/CMG helicases is also mutated or altered in function in any further way. In this manner, the CMG will be “missing mutations” due to a loss of CMG-mutated cells from the final evolutionarily determined tumor profile, similar to how the aircraft carrying Wald’s missing bullet holes fail to return to the airbase because they are lost in battle [1].
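The selection argument above is a survivorship effect, and a minimal simulation can illustrate it. The sketch below is a toy model under stated assumptions: the per-cell probability of acquiring a function-reducing CMG mutation and the probability that such a cell survives to become part of a sampled tumor are hypothetical parameters, not measured values. It counts how many CMG-mutated cells arise versus how many remain observable after selection.

```python
import random


def simulate_tumor_cohort(n_cells, p_cmg_hit, p_survive_with_hit, seed=0):
    """Toy survivorship model (all parameters hypothetical).

    n_cells: number of evolving tumor-cell lineages simulated.
    p_cmg_hit: chance a lineage acquires a function-reducing CMG mutation.
    p_survive_with_hit: chance such a lineage survives to be sequenced.
    Returns (arising, observed): CMG-mutated lineages that arose, and the
    subset still present in the final, sampled tumor population.
    """
    rng = random.Random(seed)
    arising = 0
    observed = 0
    for _ in range(n_cells):
        if rng.random() < p_cmg_hit:
            arising += 1
            # Selection step: a CMG-mutated lineage is usually lost.
            if rng.random() < p_survive_with_hit:
                observed += 1
    return arising, observed


# Mutations arise, but none survive when they are incompatible with viability.
arising, observed = simulate_tumor_cohort(10_000, 0.05, 0.0)
print(arising, observed)
```

When the survival probability is zero, mutations still arise in the simulated lineages but none appear in the observed cohort, mirroring planes with engine damage that never return to be counted. Setting the survival probability closer to one removes the selection step, so arising and observed counts converge.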

Targeting CMGs: a specific liability for tumor cells

The ability of oncogenes such as Myc or Cyclin E to drive MCM/CMG mismanagement, while considered a “strength” or selection advantage that promotes genomic instability, heterogeneity, and tumorigenesis from the overall tumor’s perspective, actually creates an exploitable vulnerability for individual tumor cell survival. Normal, non-tumor (host) cells that are proliferating have no (or fewer) mutations/alterations in Myc, Cyclin E, p53, or Rb, all of which are involved in MCM/CMG reserve mismanagement when deregulated in tumor cells. Therefore, non-tumor cells contain a full complement of MCM/CMGs, including reserves, and have the capacity to properly respond to replicative stresses that they encounter, and maintain proper replication fork speeds [139]. However, tumor cells that have acquired MCM/CMG reserve deficiencies due to oncogenic changes will likely have a reduced capacity to respond to replicative stresses and manage fork speeds. Such tumors would have a selective disadvantage relative to normal cells in responding to chemotherapy drugs that stall replication forks [38]. But more importantly, compared to non-tumor cells, tumor cells with reduced MCM/CMG reserve capacity would be predicted to be sensitive to CMG inhibitors (CMGi) that might deregulate replication fork speeds, create further replicative stress, and simultaneously inhibit any remaining MCM/CMG reserves from aiding in recovery from CMGi-induced stress. Thus, whereas normal cells have a “buffer” of reserve MCM/CMG capacity, tumor cells with MCM-mismanaging changes lack such a buffer, or their buffer is diminished, and they are predicted to be selectively vulnerable to further CMG inhibition by a pharmacologic CMGi drug or other RS-inducing insults, or both (Fig. 6).
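The “buffer” reasoning above can be sketched as simple bookkeeping. The Python below is a deliberately crude toy model, not a quantitative description of replication: the arbitrary units, the assumption that each stalled fork requires one rescue origin, and all numerical values are hypothetical and chosen only to show how the same insult can leave a well-buffered cell with spare capacity while pushing a poorly buffered cell into shortfall.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    loaded_mcm: float  # licensed MCM hexamers (arbitrary units, hypothetical)
    needed_cmg: float  # CMGs used in an unperturbed S-phase (hypothetical)


def reserve_after(cell, stalled_fraction=0.0, cmgi_fraction=0.0):
    """Spare MCM/CMG capacity after stress and inhibition.

    stalled_fraction: fraction of forks that stall; each is assumed to need
        one dormant-origin rescue (a simplifying assumption).
    cmgi_fraction: fraction of MCM/CMGs disabled by a hypothetical inhibitor.
    A negative result indicates a shortfall in this toy model.
    """
    usable = cell.loaded_mcm * (1.0 - cmgi_fraction)
    demand = cell.needed_cmg * (1.0 + stalled_fraction)
    return usable - demand


# Hypothetical values: the tumor cell loads fewer MCMs (e.g. Cyclin E-like
# effect) and over-activates CMGs (e.g. Myc-like effect).
normal = Cell("normal", loaded_mcm=10.0, needed_cmg=1.0)
tumor = Cell("tumor", loaded_mcm=3.0, needed_cmg=2.0)

for cell in (normal, tumor):
    untreated = reserve_after(cell)
    treated = reserve_after(cell, stalled_fraction=0.5, cmgi_fraction=0.5)
    print(cell.name, untreated, treated)
```

With these illustrative numbers both cells retain spare capacity when untreated, but only the normal cell still has a positive reserve once fork stalling and partial CMG inhibition are combined. The model omits fork speed, checkpoint signaling, and p53 status, so it should be read as a restatement of the argument rather than as evidence for it.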

Fig. 6: Rationale for targeting MCMs/CMGs with anti-cancer drugs.

Deficiencies in MCM/CMG management in tumor cells create a selective weakness in cancers that can be taken advantage of in the clinic with drugs designed to target MCM/CMG complexes. Oncogene-induced reduced functionality of MCM/CMG complexes, together with synthetically lethal conditions when p53 is lost and MCMs are co-reduced, renders the MCM/CMG complexes a unique vulnerability. The “missing mutation” status of MCM and CMG complexes also defines the helicase as a needed survivability factor in cancers.

The genetic interactions between p53 loss and reduced MCM/CMG reserves present another argument for tumor cells having an inherent and selective vulnerability to CMG inhibition. Loss of p53 function in a background of reduced MCM levels leads to appreciable embryonic lethality [200], indicating that a synthetic lethality is created, or nearly so, such that viability of cells is severely diminished. Replacing MCM loss/reduction with a CMG inhibitor would potentially phenocopy this loss of viability in cells lacking proper p53 function. Again, this would be a tumor-specific situation, likely exacerbated by gains in Myc or Cyclin E expression that further debilitate MCM/CMG reserves. As such, p53 mutations/loss alone are a likely predictor of sensitivity to pharmacological CMG inhibition (Fig. 6).

Evidence that tumor cells are indeed selectively sensitive to CMG inhibition comes from results showing that pancreatic ductal adenocarcinoma (PDAC) and colorectal cancer (CRC) cells lose viability when RS is applied under conditions where MCM levels are reduced by siRNA-mediated knockdown [38]. In these experiments, the reserve pool of MCMs was reduced, which did not alone affect tumor or normal cell growth. However, both tumor types were sensitized to RS-inducing standard-of-care chemotherapy drugs when MCM reserves were diminished [38]. The PDAC cells in particular were shown to be selectively reduced in proliferative capacity relative to non-tumor human skin cells, the latter of which would have a buffer of MCM reserves [38]. PDAC is often driven by mutations in K-Ras, which is known to have downstream positive effects on Myc and Cyclin E protein expression levels [220,221,222,223,224,225], and CRC is associated with changes to Myc and Ras [226, 227], consistent with such tumor cells likely mismanaging the MCM/CMG reserves. These studies indicate that a therapeutic window exists in these and likely other tumor cells for sensitivity to pharmacologic inhibition of the CMG.

Drugging the CMG with CMG inhibitors (CMGi)

The anti-cancer arsenal contains many efficacious drugs that target, directly or indirectly, the enzymes that make up the replisome and the associated factors necessary for DNA replication. Topoisomerases are targeted with doxorubicin, etoposide, and the camptothecin family of drugs, while DNA polymerases are indirectly targeted by interfering with nucleotide pools (e.g., methotrexate inhibiting DHFR, gemcitabine inhibiting ribonucleotide reductase). Clearly, the human CMG (and MCM complex) also has merits as a DNA replication target for anti-cancer approaches, and it is a druggable enzyme (see also discussion below on therapeutic considerations). An attractive and tractable means of developing CMGi could focus on identifying compounds that inhibit one or more ATP clefts of the MCM core of the CMG.

The binding and hydrolysis of ATP by kinases typically occur within a pocket generated by a single polypeptide chain. The ATP catalytic clefts of the CMG helicase are distinct from this design. ATP binding and hydrolysis occur within catalytic clefts formed at the interface between each pair of adjacent MCM subunits in the MCM hexamer [10, 19, 30, 228, 229]. This mode of ATP binding and hydrolysis by the CMG is similar to that found in other eukaryotic and viral hexameric helicases such as SV40 large-T antigen [230]. In each cleft that is created by adjacent MCM subunits, one MCM subunit (the cis subunit [230]) contributes the P-loop (canonical sequence GXXGXG) with phosphate-interacting “GKT/S” motif, while the other subunit (the trans subunit [230]) contributes a necessary arginine finger motif [10, 19, 42, 228, 229]. Using the ATP cleft of the Mcm3-7 pair as an example, the Mcm3 trans subunit provides the catalytic arginine residue, while the Mcm7 cis subunit provides the ATP-binding P-loop with the phosphate-interacting lysine residue (the Mcm7 sequence is MGDPGVAKSQ; K is amino acid 387). The six distinct ATP hydrolytic clefts of the MCM hexameric core of the CMG work in a non-symmetrical and combinatorial manner to hydrolyze ATP and alter MCM subunit structures to achieve CMG movement during DNA unwinding, and are required for MCM hexamer loading during G1 phase [9, 10, 19, 20, 42]. Thus, CMGi targeting one or more ATP clefts will likely inhibit MCM assembly and CMG function and would represent a new class of anti-cancer compounds.
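As a small worked example of the residue numbering quoted above, the following sketch locates the phosphate-interacting lysine in the Mcm7 fragment and derives positions for the neighboring residues. It assumes the fragment is contiguous and contains a single lysine; the variable names are illustrative, and every position other than 387 is inferred by simple offset arithmetic rather than stated in the text.

```python
# Mcm7 P-loop fragment and lysine position as quoted in the text.
FRAGMENT = "MGDPGVAKSQ"
LYSINE_POSITION = 387

# Zero-based index of the (only) lysine within the fragment.
k_index = FRAGMENT.index("K")

# Inferred position of the first fragment residue, assuming contiguity.
fragment_start = LYSINE_POSITION - k_index

# Map each inferred sequence position to its residue.
numbered = {fragment_start + i: aa for i, aa in enumerate(FRAGMENT)}

# The lysine should be followed by S or T, matching the "GKT/S" motif.
follows_motif = FRAGMENT[k_index + 1] in "ST"

for position, residue in numbered.items():
    marker = "  <- phosphate-interacting lysine" if position == LYSINE_POSITION else ""
    print(f"{residue}{position}{marker}")
```

Under these assumptions the fragment spans ten residues ending two positions after the lysine, and the lysine is immediately followed by serine, consistent with the “GKT/S” description.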

An alternative approach to developing CMGi could focus on the identification of compounds that mimic and interfere with the binding site interactions between Cdc45 and MCM subunits, or between the GINS complex and the MCM hexamers (or between GINS subunits). Although more difficult to develop than ATPase inhibitors, such peptidomimetic chemical compounds would likely have a different mode of action for cell growth inhibition relative to compounds that inhibit ATPase activity of the MCM hexamers. Blocking Cdc45 binding, for example, would inhibit CMG assembly and possibly cause dissociation of Cdc45 from active CMG helicases during S-phase, thereby inhibiting only CMG activity. MCM assembly onto DNA occurs in G1 and requires ATP binding and hydrolysis [20, 42], but does not require Cdc45 or GINS association. As such, drugs that interfere with Cdc45 binding to the CMG would not affect MCM binding to DNA, sparing MCM hexamers, and particularly reserve MCMs, from drug effects. In contrast, drugs that target MCM ATPase function would affect MCM assembly, reserve MCM levels, CMG assembly, and CMG activity. Because MCM ATPase inhibition would suppress a wider set of events, it is possible that such an approach could elicit more toxic side effects in the clinic, particularly when considering systemic off-tumor effects on non-tumor host cells. However, it is also possible that MCM ATPase targeting, versus inhibiting Cdc45 binding and CMGs alone, might offer clinical anti-cancer advantages due to the ability to target and reduce the assembly of reserve MCMs in tumor cells that already contain weakened MCM reserves due to oncogenic signals.

Therapeutic hypotheses and considerations for using CMGi as anti-cancer agents

Several arguments have been made in this review justifying the use of pharmacologic inhibitors of the CMG as anti-cancer agents in the clinic. First, tumors carrying mutations or elevated expression in Myc or Cyclin E (and likely Ras) are predicted to have mismanaged the CMG helicases by reducing reserves. This leads to increased RS, a diminished response to RS, and faster (lower fidelity) replication fork speeds [139]. Second, tumors with p53 (TP53) mutations or loss of function are predicted to contain an environment that is sensitive to CMG inhibition, producing synthetically lethal conditions. Third, the CMG is absolutely required for G1-S transit and tumor cell survival, and this particularly applies to any remaining levels of MCM/CMG reserves after oncogenic mismanagement. Indeed, the fact that the CMG is “missing mutations” (activating or loss of function) in human cancers is a tumor-evolutionary indicator that at least a nominal level of CMG function is necessary for tumor survival. Notably, these arguments and oncogene-induced CMG weaknesses are specific for tumor cells relative to normal host cells, suggesting that tumor cells will be selectively sensitive to CMGi, an ideal property for a chemotherapeutic target (Fig. 6).

A question that can be raised is whether future CMGi will have advantages over current chemotherapeutic treatments in the clinic. Many standard-of-care chemotherapy drugs directly or indirectly target enzymes involved in DNA replication, such as topoisomerases, ribonucleotide reductase, dihydrofolate reductase, or DNA polymerases (doxorubicin, etoposide, gemcitabine, methotrexate), or create DNA damage (platinum drugs, alkylating agents). Tumor cells are generally more sensitive to these compounds due to increased proliferation, heightened ongoing DNA damage, or deficiencies in DNA repair relative to normal host cells. However, most of these chemotherapy drugs have elevated toxicities in patients, with narrow therapeutic windows. One likely explanation is that normal host cells also require these enzyme functions, particularly cells that are proliferating, and as a general rule these chemotherapy enzyme targets are not themselves known to be functionally weakened in tumor cells (nor in normal cells). The CMG helicase is likely also a viable chemotherapy target for these same reasons, being necessary for DNA replication and responses to DNA damage, and it is also true that CMGi could elicit some off-tumor toxicities due to similar roles for the CMG in normal host cells. As such, the development of future CMGi would provide an alternative anti-cancer chemotherapy agent, with different pharmacologic properties, that could be similarly useful against select cancers.

However, there are also unique advantages to using CMGi in place of chemotherapy (or with chemotherapy as a sensitizer) due to inherent weaknesses and mismanagement of the MCM/CMG complexes created by oncogenic changes (e.g., Myc or Cyclin E overexpression) and other tumor-specific vulnerabilities discussed above. In tumor cells, mismanagement problems render the MCM/CMG complexes, as drug targets, mechanistically distinct weaknesses compared to current chemotherapy enzyme targets. We propose that targeting the MCM/CMG complexes with future CMGi/MCMi offers the potential for an innovative tumor-selective molecular targeting approach with unique advantages over chemotherapy. Indeed, certain tumors with overexpressed Myc or Cyclin E might be particularly sensitive to CMGi versus standard chemotherapy, and inherent weaknesses in the MCM/CMG complex might provide a wider therapeutic window for CMGi/MCMi use against other cancers. In addition, as described above, developing inhibitors that target the CMG specifically (CMGi) or the MCM ATPase domains (MCMi) could provide different clinical outcomes in terms of off-tumor toxicities.

As an example of an innovative use of CMGi in the clinic, osteosarcoma (OS) is an aggressive bone tumor in adolescents and young adults that has been examined by a large-scale genomic sequencing effort called the TARGET-OS Project [231]. Although no single mutated driver of OS was identified in these analyses, the vast majority of tumors demonstrate cell cycle dysregulation, with subgroups of OS showing gains in Myc or Cyclin E (CCNE1), loss of Rb, or gain of Cdk4 (which regulates Rb). In addition, the vast majority of OS have completely lost p53 protein expression, or TP53 is mutated [232]. Although each of these genetic alterations in OS affects proteins that are not themselves druggable, the collective nature of these genetic changes would imply that the CMG helicase is a potential vulnerability across many OS subtypes. As such, new clinical approaches to OS treatment, or for other tumor situations where a known driver does not exist, might consider incorporating CMGi as a pharmacologic intervention in patients stratified for known CMG-modifying genetic changes such as amplified Myc, CCNE1, and/or TP53 loss.

Related to the above discussion on chemotherapy versus targeting the CMG, CMGi/MCMi might be useful in enhancing the effectiveness of existing chemotherapy drugs that induce replicative stresses requiring proper function of the CMG in recovery efforts. The combination of CMGi plus chemotherapy could potentially allow a reduced level of chemotherapy drug to be administered to alleviate off-tumor toxicities, while the CMGi would inhibit the CMG enzyme required for recovering from chemotherapy insults. A clinical consideration for use of CMGi in this manner is the timing or scheduling of CMGi administration in combination with standard-of-care chemotherapeutic regimens. If a CMG inhibitor blocks MCM loading in G1, then it is possible that pre-treatment with a CMGi may reduce the effectiveness of fork-stalling chemotherapy if treated cells fail to enter S-phase. Conversely, it may be more advantageous to treat with a CMGi after such chemotherapy exposure, where the chemotherapy induces replicative stress in tumors that then critically require (remaining) CMG reserves to be functional.

It is also important to consider the duration of exposure to CMGi, as the mouse studies would suggest that long-term CMGi treatment might create reduced genomic stability and DNA damage even in normal cells, leading to unwanted secondary tumor development [200,201,202,203]. Clearly, this is also true of conventional chemotherapeutic drugs that damage, crosslink, or alkylate/modify DNA. However, tumor development in mice required conditions in which MCM levels were constantly reduced during the entire developmental timeframe of embryogenesis to adulthood [200, 201, 203], which is arguably not a situation that would be encountered in the clinic when using CMGi.

The CMG could be targeted clinically using indirect methods, with inhibitors to proteins that regulate CMG assembly and/or activation. For example, Cdc7 kinase inhibitors are being tested in the clinic, and Cdc7 is required for CMG activation [233,234,235,236,237,238,239,240,241,242]. However, kinase inhibitors sometimes have activity against other kinases, or the targeted kinases regulate other important cell cycle events aside from CMG management, making interpretation of their clinical effects difficult to assign specifically to CMG inhibition [3]. Other future druggable targets that regulate the CMG include ORC and Cdc6, but specific inhibitors/drugs against these enzymes do not currently exist.

Finally, an emerging clinical argument for using CMGi may derive from the unique roles of the CMG helicase in DNA replication and recovering from DNA damage or fork-stalling conditions [243]. Many solid tumors contain deficiencies in DNA repair, either as an early genetic driving condition (e.g., inherited Brca1/2 mutations), or acquired during the tumorigenic process [244, 245]. These deficiencies can be in homologous recombination repair (HR), non-homologous end-joining repair (NHEJ), base-excision repair (BER), or nucleotide-excision repair (NER). It is possible that signatures in tumors indicative of DNA repair deficiencies may predict increased sensitivity to CMGi, as the co-existence of multiple DNA repair/recovery deficits may create a synthetically lethal situation specific to certain tumors. Emerging characterization signatures for chromosomal instability may be biomarkers to help guide patient selection for future trials investigating this possibility [246, 247]. Genes in which mutations should be considered in this regard include Brca1/2, Fanconi anemia (FANC) genes, ATM, PALB2, and NBS. Similarly, combination of CMGi with existing DNA repair inhibitors such as PARP inhibitors or ATR inhibitors may show promise in the clinic for certain malignancies.

Closing thoughts

Although we have focused on a select few cancer-promoting gene products as being MCM/CMG helicase (mis)regulators, namely Myc, Cyclin E, p53, and Rb, there are undoubtedly other genes and protein products that will be found to have tumor-driving influences on the CMG and genome stability. Similarly, we cannot rule out that Mcm, Cdc45, or GINS mutants may be discovered in certain human malignancies that might slightly alter CMG function but remain compatible with survival. There are also other cancer-relevant CMG issues awaiting further investigation that we have not discussed here. For example, Mcm3 interacts with Keap1, a regulator of Nrf2 and responses to oxidative stresses [248, 249]. What this interaction means functionally for the CMG has not been unraveled. MCMs also interact with cohesins in regulating chromosome topology [250, 251], and MCMs are required for loading of cohesins [252,253,254], which link daughter chromatids after DNA replication and may influence condensation in mitosis [255,256,257]. There are also associations of MCM and ORC subunits with centrosomes, which play a role in DNA replication and mitosis [258, 259]. Mechanisms by which changes in MCM/CMG reserves affect cohesion, centrosomes, and mitosis may predict chemosensitizing synthetic effects for CMGi and mitotic drugs such as taxanes.

Our goal here was to dissect some of the known cancer-driving influences on MCM/CMG assembly and activation dynamics, and to illustrate that while the CMG may appear to be passively directed by many cell cycle processes and signals, once properly assembled and regulated, the CMGs and MCMs also perform active roles in the fidelity of DNA replication, recovery from stresses, and regulation of fork speeds, all of which collectively maintain genomic stability. Abraham Wald’s mathematical and theoretical genius clearly applies to the CMG helicase and offers explanations for why the CMG is “missing mutations” in cancer. Human cancers are telling us that the CMG is, like Wald’s airplane engine, critical for tumor cell survival. Weakened by oncogenic changes, wild-type CMG becomes a tumor-specific weakness that should be targeted with drugs to selectively inhibit cancer growth.