
The archaea are a diverse range of microorganisms that share more recent evolutionary history with eukaryotes than do the bacteria (Woese and Fox 1977). The precise timing of the divergence of the archaeal and eukaryotic lineages is the subject of considerable debate, with some studies even suggesting that eukaryotes arose from within the archaeal domain of life (Williams et al. 2013; Rivera and Lake 2004; Forterre 2015). A number of phyla have been identified within the Archaea; again controversy exists regarding the precise nature of the taxonomic divisions between archaeal phyla. With increased sampling, particularly at the metagenomic level, some degree of consensus is being established. It is generally accepted that there is a broad divide between the phylum of the Euryarchaea and those of the Thaumarchaea, Aigarchaea, Crenarchaea, and Korarchaea. The latter four taxonomic groupings appear more closely related to one another and have been termed the “TACK superphylum” (Guy and Ettema 2011; Brochier-Armanet et al. 2008; Forterre 2015). At the morphological level, archaea are prokaryotes; most species have a single cell membrane and are devoid of any organellar structures.

Thus far, all archaea characterized have circular chromosomes; however, the chromosome copy number shows considerable variation across taxonomic divides. To a broad approximation, euryarchaea appear to be generally oligoploid or polyploid, while the members of TACK that have been studied have cell cycles that oscillate between one and two copies of their chromosome (Table 5.1) (Samson and Bell 2014; Breuert et al. 2006; Hildenbrand et al. 2011). Flow cytometry studies have revealed that the TACK superphylum organisms, such as members of the Sulfolobus genus of hyperthermophilic acidophiles, have cell cycles that contain defined gap phases separating DNA replication and cell division (Lundgren et al. 2008; Pelve et al. 2013). These observations have led to the adoption of the G1, S, G2, and M phase nomenclatures established in studies of the eukaryotic cell cycle to describe the analogous stages of archaeal cell cycle progression. It must be emphasized, however, that there is no evidence that archaeal chromosome segregation is in any way related to eukaryotic mitosis. Interestingly, in marked contrast to the orchestrated cell cycles of crenarchaea, the euryarchaea that have been studied appear to lack obvious gap phases, perhaps hinting that cell division can occur during ongoing rounds of replication of the multiple copies of the chromosome, in a manner somewhat reminiscent of fast-growing E. coli (Sherratt 2003).

Table 5.1 Taxonomic distribution of archaeal species described in the text

5.1 The Replication Machinery of Archaea

With the availability of whole genome sequences of archaeal species in the 1990s, it became apparent that archaea possess clear orthologs of eukaryotic DNA replication-associated proteins (Edgell and Doolittle 1997). In general, and in keeping with the organizational simplicity of the organisms, the archaeal replication proteins are simplified versions of their eukaryotic counterparts (Barry and Bell 2006; Kelman and Kelman 2014). For example, the eukaryotic MCM(2-7) replicative helicase has six distinct subunits. However, all six subunits are related to each other in sequence, suggesting derivation from a common ancestor. Indeed, the majority of present-day archaea encode a single mcm gene, the product of which assembles into a homohexamer (Costa and Onesti 2009; Bochman and Schwacha 2009). Similarly, almost all archaea encode a protein that is related to both Cdc6 and the Orc1 component of the eukaryotic origin recognition complex, ORC (Bell 2012). Interestingly, early branching eukaryotes, such as trypanosomes, also encode an archaeal-like Orc1/Cdc6 protein, suggesting that the gene duplication and sequence diversification leading to “higher” eukaryotic Orc1 and Cdc6 occurred within the eukaryotic lineage (Samson and Bell 2016; Tiengwe et al. 2012). Importantly, the bacterial replication machinery, although ultimately performing the same function, is largely non-orthologous to the shared archaeal/eukaryotic replication apparatus. The key exceptions lie in the clamp loader and sliding clamp that facilitate DNA polymerization, leading to the proposal that the elongation machinery is fundamentally conserved and thus ancestral, even though the rest of the replisome components are not conserved between bacteria and archaea/eukarya (Yao and O’Donnell 2016).

5.2 Archaeal Replication Initiation

The first archaeon in which the replication mode was experimentally determined, the euryarchaeon Pyrococcus abyssi, revealed a single origin of replication. The origin, oriC, is located in a gene environment that contains genes for several replication-associated proteins, including the gene for the candidate initiator protein, orc1/cdc6 (Myllykallio et al. 2000; Bell 2012). The orc1/cdc6 nomenclature is cumbersome, and orthologs in archaeal genomes have been variously annotated as orc1 or cdc6 on an apparently random basis. In this chapter, for simplicity’s sake, I will refer to these genes as orc1. Many archaea encode multiple Orc1 paralogs, and I will refer to these as Orc1-1, Orc1-2, etc.

Interestingly, the single-origin paradigm in Pyrococcus species actually appears to be atypical among the archaea, and it is now known that many archaea from both euryarchaea and TACK species have multiple replication origins per chromosome (Robinson and Bell 2007; Robinson et al. 2004; Robinson et al. 2007; Lundgren et al. 2004; Norais et al. 2007; Hawkins et al. 2013; Yang et al. 2015; Pelve et al. 2012). The highest number of origins reported is four per chromosome for lab strains of the euryarchaeon Haloferax volcanii and also the crenarchaeon Pyrobaculum calidifontis (Pelve et al. 2012; Hawkins et al. 2013). For most species, while origin number and location have been established, the extent to which each origin is used remains poorly resolved. The exception to this lies in Sulfolobus species where three origins have been mapped, and these have been experimentally determined to fire once per cell cycle (Duggin et al. 2008). Studies with synchronized cell populations have revealed that two of the origins, oriC1 and oriC3, fire synchronously, thereby defining the start of S phase. Notably, oriC2 fires a few minutes later. As will be discussed below, this temporal delay is likely linked to the expression of the initiator protein that defines this origin.

Many archaea encode multiple Orc1 paralogs. In the case of Sulfolobus, three such proteins, Orc1-1, Orc1-2, and Orc1-3, are encoded in the 2.2–3 megabase-pair genome. Sulfolobus also encodes a further candidate initiator protein, WhiP, that is a distant homolog of another eukaryotic replication initiation protein, the helicase co-loader, Cdt1 (Robinson and Bell 2007).

5.3 Origin Specification

Genetic studies in Sulfolobus islandicus have revealed a simple one-to-one relationship between the location of initiator protein genes (Fig. 5.1) and the origins that they specify (Samson et al. 2013). More specifically, Orc1-1 is encoded adjacent to, and specifies origin function at, oriC1; Orc1-3 is adjacent to oriC2 and is required for function at that origin, and finally the gene for WhiP is beside oriC3, and the WhiP gene product is necessary for oriC3 function. Furthermore, the initiator protein encoded adjacent to each origin is both necessary and sufficient for its cognate origin function. What then of Orc1-2? The orc1-2 gene is not encoded immediately adjacent to any of the three origins, and deletion of orc1-2 does not affect firing at any of the three origins. A range of biochemical and transcriptomic analyses have implicated Orc1-2 as a negative regulator of replication (Robinson et al. 2004; Maaty et al. 2009; Frols et al. 2007; Gotz et al. 2007). However, its role in this regard remains to be firmly established. Thus, the Sulfolobus islandicus chromosome is a mosaic of three distinct replicons, each origin having its own specific initiator. Analyses of the phyletic distribution of the initiator proteins reveal that Orc1-1 is highly conserved across a broad range of archaeal species. For example, the single orc1 gene encoded by Pyrococcus is most closely related to Sulfolobus Orc1-1. Indeed, it was demonstrated that Sulfolobus solfataricus Orc1-1 can bind specifically to conserved sequence elements, termed ORB (origin recognition box), in the Pyrococcus oriC in vitro (Robinson et al. 2004). ORB elements are conserved across the archaeal domain of life and possess a dyad symmetric element flanked uniquely on one side by a G-rich element. 
Interestingly, all characterized oriC1 origins in archaea possess at least two ORB elements in inverted orientation and separated by an AT-rich candidate duplex unwinding element (see Samson and Bell 2016 for a review). The nature of Orc1-1 interaction with ORB elements is discussed below.
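
The one-to-one relationship between origins and initiators described above for S. islandicus can be captured in a small lookup table. The following Python fragment is only an illustrative sketch: the dictionary and the function name `active_origins` are hypothetical, and the model assumes, as the genetic data suggest, that each origin depends solely on its adjacent initiator and that Orc1-2 is not required at any origin.

```python
# Origin -> initiator required for its function in S. islandicus,
# as summarized in the text (Samson et al. 2013).
ORIGIN_INITIATOR = {
    "oriC1": "Orc1-1",
    "oriC2": "Orc1-3",
    "oriC3": "WhiP",
}


def active_origins(deleted_initiators):
    """Return the origins predicted to remain functional.

    Assumption of this sketch: an origin fires if, and only if,
    its cognate initiator is still present.
    """
    deleted = set(deleted_initiators)
    return sorted(
        origin
        for origin, initiator in ORIGIN_INITIATOR.items()
        if initiator not in deleted
    )


if __name__ == "__main__":
    print(active_origins(["Orc1-2"]))  # Orc1-2 specifies no origin
    print(active_origins(["Orc1-3"]))  # only oriC2 is lost
```

In this toy model, removing Orc1-2 leaves all three origins active, whereas removing Orc1-3 or WhiP silences only oriC2 or oriC3, respectively.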

Fig. 5.1

Diagram of the organization of the 2.5 Mb chromosome of Sulfolobus islandicus. The relative positions of the three origins are indicated along with their cognate initiator proteins (Samson et al. 2013). Genetic dependence of the origin upon initiators is indicated by the circular arrows

In contrast to the near universality of Orc1-1, Orc1-3 appears to be restricted to the Sulfolobales, and WhiP is found in both Sulfolobales and Desulfurococcales. This patchy distribution of the initiators suggests that the oriC2 and oriC3 origin/initiator cassettes are relatively recent acquisitions, and it has been proposed that they have been acquired by incorporation of extrachromosomal elements into an ancestral oriC1/Orc1-1 containing chromosome (McGeoch and Bell 2008; Robinson and Bell 2007; Samson and Bell 2014).

Direct evidence for functional incorporation of extrachromosomal origins has been documented in the halophilic euryarchaeon Haloferax volcanii where a lab strain differs from the parental strain by incorporation of a large plasmid, pHV4, into the host chromosome (Hawkins et al. 2013). Importantly, the origin on the plasmid remains functional in its new integrated location. The malleability of the replicon architecture of the H. volcanii main chromosome is underscored by the remarkable observation that its replication can be maintained even in the apparent absence of active replication origins. More specifically, experiments to delete all four origins from the main chromosome of the H. volcanii lab strain were successful, and, very strangely, the resultant “zero origin” strain actually outcompeted the wild type in coculture experiments. The zero origin strain was highly dependent on the RAD51/RecA ortholog, RadA, suggesting that a recombination-based mechanism was able to drive genome duplication (Hawkins et al. 2013). How general this remarkable observation is remains unclear (Michel and Bernander 2014). When similar experiments were performed in the closely related H. mediterranei, the main chromosome of which normally has three active origins (Fig. 5.2), deletion of the three origins led to activation of a cryptic novel origin of replication (Yang et al. 2015). It is possible that the high ploidy, sexual promiscuity (as manifested by high levels of intraspecies and even interspecies genetic exchange in this organism), and natural competence, i.e., the ability to take up DNA from the medium, contribute to H. volcanii’s remarkable genomic plasticity (Zerulla et al. 2014; Zerulla and Soppa 2014; Naor et al. 2012).

Fig. 5.2

Diagram of the organization of the 2.95 Mb main chromosome of Haloferax mediterranei. The locations of the three active origins in wild-type cells are shown in the left-hand panel. The right-hand panel indicates that, upon deletion of oriC1, oriC2, and oriC3, cell viability is maintained by activation of a novel cryptic origin, oriC4. For details see Yang et al. (2015)

Genetic studies in Sulfolobus islandicus (Sis) reveal that at least one replication origin is essential for viability and that each origin has a unique initiator protein. Intriguingly, this simple binary relationship of origin and initiator is not conserved across the Sulfolobus genus. Studies in Sulfolobus solfataricus (Sso) have revealed that oriC2 in that species is bound by both Orc1-1 and Orc1-3. While the genetic dependence of this origin on both initiators has not been tested, the binding of both proteins is supported by a range of chromatin immunoprecipitation, biochemical, and structural studies (Robinson et al. 2004; Dueber et al. 2007; Dueber et al. 2011). The Orc1-1 and Orc1-3 binding sites at this origin are immediately adjacent, and the two proteins have a 360 Å² protein-protein interface (Dueber et al. 2007). A biochemical comparison of Orc1-1 and oriC2 between S. islandicus and S. solfataricus revealed that both origin sequence and protein sequence have evolved to allow the binding of S. solfataricus Orc1-1 to oriC2 in that species (Samson et al. 2013). This enhanced complexity of origin specification may give insight into the evolutionary transitions that drove the evolution of the multi-subunit present-day ORC complex found in eukaryotes.

5.4 Orc1 Protein Structure and Function

The structural studies of Sulfolobus Orc1-1 and Orc1-3 bound to oriC2, in conjunction with the work from the Wigley lab on Aeropyrum pernix Orc1-1 bound to its cognate oriC1, revealed some general principles of Orc1 protein/DNA interactions (Dueber et al. 2007; Gaudier et al. 2007). The archaeal Orc1 proteins are approximately 43 kDa in size and possess an N-terminal AAA+ domain and a C-terminal winged-helix (wH) DNA-binding domain (Fig. 5.3). While mutational studies had demonstrated the importance of the wH domain in DNA binding, the structural studies revealed that the AAA+ domain also contacted the DNA (Gaudier et al. 2007; Robinson et al. 2004; Dueber et al. 2007; Dueber et al. 2011). The contact between the AAA+ domain and DNA is mediated by a signature embellishment to the classical AAA+ fold found in the initiator clade of AAA+ proteins, termed the initiator-specific motif (ISM). Thus, the Orc1 proteins make extended bipartite interactions with the origin DNA (Fig. 5.3). It had been demonstrated that Orc1-1 binds to conserved sequence elements, termed ORBs, at oriC1 (Robinson et al. 2004). ORB elements contain a dyad symmetric element flanked on one side only by a string of G-C base pairs. The wH domain recognizes the dyad element, and the G-string interacts with the ISM (Fig. 5.3). Despite the presence of the dyad element, only a single Orc1-1 molecule binds per ORB element. The structural studies revealed that binding of Orc1-1 substantially distorts and underwinds the DNA to the extent that a second Orc1-1 molecule is unable to recognize the symmetry-related binding site (Gaudier et al. 2007). The preferred orientation of Orc1-1 on an ORB element is presumably defined by the unique ISM-G-string interaction.
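
The way in which a one-sided G-string gives polarity to an otherwise symmetric site can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the sequence `TOY_ORB` is invented and is not a real ORB element, the element lengths are arbitrary, and the function names are hypothetical. The point is only that a dyad (a sequence equal to its own reverse complement) reads the same on both strands, whereas the flanking G-run does not, so the site as a whole has a defined orientation.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_dyad_symmetric(seq: str) -> bool:
    """A dyad symmetric element equals its own reverse complement."""
    return seq == reverse_complement(seq)


def reads_as_orb(site: str, dyad_len: int, g_len: int) -> bool:
    """True if the site is a dyad followed immediately by a G-run."""
    dyad = site[:dyad_len]
    flank = site[dyad_len:dyad_len + g_len]
    return is_dyad_symmetric(dyad) and flank == "G" * g_len


def orb_orientation(site: str, dyad_len: int = 8, g_len: int = 4):
    """Classify a toy site as 'forward', 'reverse', or None."""
    if reads_as_orb(site, dyad_len, g_len):
        return "forward"
    if reads_as_orb(reverse_complement(site), dyad_len, g_len):
        return "reverse"
    return None


# Invented example: an 8 bp dyad followed by a 4 bp G-run (NOT a real ORB).
TOY_ORB = "TCCATGGA" + "GGGG"

if __name__ == "__main__":
    print(orb_orientation(TOY_ORB))
    print(orb_orientation(reverse_complement(TOY_ORB)))
```

Two such toy sites placed in opposite orientations would correspond to the inverted arrangement of ORB elements found at oriC1.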

Fig. 5.3

Structure of the Orc1 proteins. The upper panel is a linear representation of the protein. The N-terminal two-thirds are a AAA+ domain, and the positions of the Walker A (WA), Walker B (WB), and Sensor 2 (S2) motifs are indicated. The ISM is the signature initiator-specific motif embellishment to the AAA+ fold found in the initiator clade of AAA+ proteins. MRM indicates the location of the MCM recruitment motif. The C-terminal third of the protein forms a winged-helix (wH) fold. The crystal structure shown below is of Orc1-1 bound to an ORB element (PDB Accession Number 2V1U). The ORB element is shown by a large gray arrow with internal dyad element and G-string element indicated. The orientation of the arrow is the same as that in Fig. 5.5. The wH domain in red interacts with the dyad symmetric element of the DNA. The ISM, in blue, mediates contacts with a G-rich element, the so-called G-string. ADP is present in the active site of the AAA+ domain and is shown in magenta. The residues highlighted in cyan have been demonstrated to be essential for recruitment of MCM by Orc1-1 (residue numbering based on the S. islandicus Orc1-1 protein)

Thus, at oriC1, Orc1-1 binds to ORB elements as a monomer. Another key finding of the structural studies was that the active site of the AAA+ ATPase domain was occupied by ADP. As no nucleotide was supplemented during purification and crystallization, this presumably reflects ATP bound during expression in E. coli and hydrolyzed during the expression and purification processes. Biochemical studies have confirmed that ADP is extremely stably bound to Orc1 proteins. Indeed, extensive protein denaturation and subsequent refolding are required to obtain nucleotide-free protein with which to perform ATPase studies (Grainge et al. 2006; Samson et al. 2013). Such studies have revealed that Orc1-1 undergoes a single-turnover ATP hydrolysis event leaving ADP stably bound in the active site. While bacterial DnaA, like Orc1-1, is active in its ATP-bound state, this activation is manifested in a fundamentally distinct manner from that of Orc1-1. ATP facilitates multimerization of DnaA, ultimately resulting in direct remodeling and melting of the origin DNA (see Bleichert et al. 2017 for a review). In contrast, Orc1-1 remains monomeric when ATP bound and undergoes a subtle conformational change (Samson et al. 2013) that facilitates interaction with MCM, as described below.

Studies using mutated versions of Orc1-1 in vivo and in vitro have revealed that stabilization of the ATP-bound form of the protein by substitution of the so-called Walker B glutamic acid residue by alanine results in a highly active form of the protein (Samson et al. 2013; Samson et al. 2016). In contrast, the ADP-bound form of Orc1-1 is inactive in MCM loading in vitro. On the biochemical level, ATP binding did not alter either the affinity or stoichiometry of Orc1-1 binding to DNA. Rather, ATP binding simply induced a modest conformational change in the protein, as detected by analytical ultracentrifugation and protease sensitivity assays. Despite these modest changes, the constitutively ATP-bound form of the protein was far more active in vitro than the ADP form (Samson et al. 2013). Thus, it appears that ATP binding and not hydrolysis is required for Orc1-1 function. Importantly, expression of the Walker B mutant form of the protein in vivo resulted in an overreplication phenotype, suggesting that ATP hydrolysis serves as an off switch. In this regard, it is significant that the orc1-1 gene shows cell cycle-dependent regulation of its expression with transcript levels highest in cells about to enter G1 (Samson et al. 2013). Thus, the cell cycle dependence of orc1-1 expression, coupled with the single-turnover ATP hydrolysis activity, indicates that Orc1-1 is acting as a molecular switch, permitting MCM recruitment in the Orc1-1•ATP state and inhibiting it in the Orc1-1•ADP state. Such a binary switch behavior is likely important for ensuring once-per-cell cycle regulation of origin activity (Fig. 5.4). The timing of expression of the initiator protein gene thus helps define a permissive window for initiator function. As mentioned above, oriC2 fires a few minutes later in the cell cycle than does oriC1. This is reflected in the later peak of transcription of the Orc1-3 mRNA, relative to that for Orc1-1 (Samson et al. 2013). 
How the ADP-bound form of the initiator is removed from the origin at the end of the cell cycle is currently unknown. Possible explanations include an ATP exchange factor or, perhaps more likely given the new wave of orc1-1 transcription, targeted destruction of Orc1-1•ADP at cell division.
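
The switch-like behavior of Orc1-1 described above can be summarized as a minimal two-state model. The Python sketch below is a deliberate simplification: the class name `Orc1Switch`, the discrete time steps, and the treatment of the Walker B mutant as completely unable to hydrolyze ATP are assumptions made for illustration, not measured properties. Newly synthesized protein is taken to be ATP bound, and ADP, once formed, is never exchanged.

```python
from dataclasses import dataclass


@dataclass
class Orc1Switch:
    """Toy model of the Orc1-1 nucleotide switch."""

    walker_b_mutant: bool = False
    nucleotide: str = "ATP"  # assumed state of newly synthesized protein

    def hydrolyze(self) -> None:
        """Single-turnover hydrolysis: ATP -> ADP, with no exchange back."""
        if self.nucleotide == "ATP" and not self.walker_b_mutant:
            self.nucleotide = "ADP"

    def can_recruit_mcm(self) -> bool:
        """Only the ATP-bound form is permissive for MCM recruitment."""
        return self.nucleotide == "ATP"


def permissive_window(protein: Orc1Switch, steps: int = 3) -> list:
    """Record, step by step, whether MCM recruitment is permitted."""
    window = []
    for _ in range(steps):
        window.append(protein.can_recruit_mcm())
        protein.hydrolyze()
    return window


if __name__ == "__main__":
    print(permissive_window(Orc1Switch()))
    print(permissive_window(Orc1Switch(walker_b_mutant=True)))
```

In this sketch the wild-type protein is permissive only in the first step, whereas the Walker B mutant remains permissive throughout, which is consistent with the overreplication phenotype noted above.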

Fig. 5.4

Cartoon of the Sulfolobus cell cycle. The red and black spheres labeled ATP and ADP represent the nucleotide status of Orc1 proteins. A pulse of transcription of orc1-1 at the time of cell division will produce Orc1-1 associated with ATP; subsequently ATP will be hydrolyzed to ADP in a single-turnover event. ADP will thus remain stably associated with Orc1-1 for the rest of the cell cycle

5.5 MCM Recruitment to Archaeal Replication Origins

As alluded to above, Orc1-1 is able to recruit MCM to oriC1 in a defined reaction using recombinant proteins purified from E. coli (Samson et al. 2016). These experiments reveal that, in addition to Orc1-1 sharing sequence homology with Orc1 and Cdc6 of eukaryotes, Orc1-1 also shares Orc1 and Cdc6’s respective functions of origin binding and helicase recruitment. Orc1-1•ATP was shown to contact MCM’s C-terminal wH domain via a conserved motif in the lid domain of the AAA+ domain (the MRM – MCM recruitment motif; see Fig. 5.3). The basis of the ATP dependence of Orc1-1’s functionality was ascribed to the Sensor 2 motif. This conserved arginine residue has the capacity to coordinate the gamma phosphate of ATP and in doing so modulate the relative positions of the two subdomains of the AAA+ module. Importantly, mutation of the Sensor 2 residue led to a protein that bound ATP but had substantially reduced ATPase activity. However, unlike the Walker B mutant that has similar ATPase-null behavior, the Sensor 2 mutant Orc1-1 was unable to recruit MCM to the origin in vitro and did not support origin firing in vivo. Thus, the Sensor 2 residue may act to transduce the information of the nucleotide status of Orc1-1 to the conformation of the MCM recruitment site (Samson et al. 2016).

5.6 Active Loading or Passive Recruitment of MCM?

Classical views of the MCM helicase portray it as a ring-shaped hexamer (Costa and Onesti 2009). However, structural studies of both eukaryotic MCM2-7 and archaeal MCM have revealed a range of conformations. With regard to the archaeal MCMs, single and double hexamers and heptamers have been described, as have open-ring and even filamentous forms of the protein (Chen et al. 2005; Pape et al. 2003; Slaymaker et al. 2013; Samson and Bell 2016; Samson et al. 2016). There has been considerable debate about how the MCM ring might be opened to allow loading onto DNA (Yardimci and Walter 2014; Sakakibara et al. 2009). With regard to the archaeal protein, a notable electron microscopy study demonstrated that simply heating the MCM of Methanothermobacter thermautotrophicus to its normal physiological growth temperature resulted in greater than half of the particles adopting an open-ring conformation (Chen et al. 2005). Similarly, heat treatment of Sulfolobus MCM resulted in substantial elevation of recruitment of MCM by Orc1-1 to oriC1 in vitro (Samson et al. 2016). Thus, based on Orc1-1’s monomeric behavior, single-turnover ATP hydrolysis, activity when ATP bound, switch-off upon ATP hydrolysis, and the thermodynamically favored opening of MCM, we have proposed that Orc1-1 is acting as a conditional platform for MCM recruitment to replication origins. Importantly, oriC1 possesses ORB elements aligned in inverted orientation flanking a ~90 bp AT-rich region. Replication initiation has been mapped at the boundary of this candidate duplex unwinding element, and so it is believed that two hexamers of MCM are loaded into this region by Orc1-1 bound to the flanking ORB elements (Fig. 5.5).

Fig. 5.5

Model of the ATP-dependent recruitment of MCM by Orc1 proteins. ATP-bound Orc1-1 associates with inverted ORB elements at oriC1. As illustrated in Fig. 5.3, Orc1-1 binds to ORB elements as a monomer with a defined polarity – the AAA+ module contacting a G-rich element and the wH domain binding a short inverted repeat. The region between the inverted ORB elements, colored in blue, is highly AT rich. The MRM is positioned such that it can interact with MCM, leading to MCM’s recruitment to the origin with both hexamers encircling double-stranded DNA. Subsequent hydrolysis of ATP to ADP repositions the MRM (shown in black in the “off” state), preventing further rounds of MCM recruitment

5.7 The Archaeal CMG Complex

The molecular basis of how initial DNA unwinding at replication origins is effected remains unknown at this time in both archaeal and eukaryotic systems. In eukaryotes, it is well established that the ultimate activation of the MCM helicase is tightly regulated and involves the facilitated recruitment of Cdc45 and GINS to form an active helicase assembly, termed CMG, that is capable of driving replication fork progression (Bell and Labib 2016).

Eukaryotic GINS is composed of four distinct subunits, Psf1, Psf2, Psf3, and Sld5 (Labib and Gambus 2007; MacNeill 2010). The subunits fall into two classes, related to each other by circular permutation. Psf2 and Psf3 have a domain order BA with a beta-strand domain followed by an alpha-helical domain. In Psf1 and Sld5, the order of the domains is switched to AB. The archaeal orthologs were initially identified by virtue of their ability to interact with the N-terminal domains of MCM in a yeast two-hybrid screen. The first archaeal GINS ortholog identified was shown to be related to both Psf2 and Psf3 and was thus named Gins23 (Marinsek et al. 2006). Interestingly, the gins23 gene is encoded within a bi-cistronic operon with mcm. Biochemical studies revealed that Gins23 co-purified with another small protein that was revealed to be related to Psf1 and Sld5 and thus named Gins15. The archaeal GINS assembly was shown to be a tetramer, containing two copies each of Gins15 and Gins23 (Marinsek et al. 2006). While the initial work was performed in Sulfolobus, the archaeal GINS complex is now known to be conserved across the archaeal domain of life (MacNeill 2010; Oyama et al. 2011; Yoshimochi et al. 2008; Oyama et al. 2016). During the biochemical isolation of Sulfolobus GINS, a further polypeptide co-purified over eight steps and was identified as being related to the DNA-binding fold of the RecJ superfamily of proteins, leading to its initial name of RecJdbh (Marinsek et al. 2006). Subsequent work has revealed an unambiguous relationship between RecJ and eukaryotic Cdc45, and so RecJdbh has been renamed as Cdc45 (Sanchez-Pulido and Ponting 2011; Xu et al. 2016). Interestingly, Cdc45-related proteins have been identified across the archaeal domain of life but appear phylogenetically diverse (Makarova et al. 2012). 
One such protein, termed GAN, has been shown to be capable of association with GINS in the organism Thermococcus kodakarensis and, intriguingly, appears to be active as a nuclease (Li et al. 2011; Oyama et al. 2016). Recent structural studies have confirmed the GAN•GINS interaction and revealed the basis of the interaction between the GAN and the C-terminal domain of Gins15 (Oyama et al. 2016). Notably, in eukaryotes, an analogous interaction is observed between Psf1’s CTD and Cdc45 (Costa et al. 2011).

In Sulfolobus, Cdc45 appears to be very tightly associated with GINS as evidenced by their co-purification over multiple steps (Marinsek et al. 2006). Furthermore, experiments with recombinant GINS and Cdc45 have revealed that the Cdc45•GINS complex (termed CG) is resistant to up to 8 M urea (Xu et al. 2016). Chromatin immunoprecipitation experiments have demonstrated that Cdc45 (and by inference, GINS) associates with MCM at replication origins and proceeds with the helicase during DNA synthesis. At the biochemical level, association of CG with MCM leads to a robust stimulation of helicase activity. Importantly, neither Cdc45 nor GINS when individually added to MCM results in detectable stimulation of helicase activity (Xu et al. 2016). While this latter observation agrees with initial reports that Sulfolobus GINS did not stimulate MCM’s helicase activity (Marinsek et al. 2006), a report from the Huang laboratory has suggested that Sulfolobus GINS alone could stimulate MCM (Lang and Huang 2015).

One important difference between the archaeal and eukaryotic Cdc45 and GINS association lies in the composition of the assembly. While both eukaryotic Psf1 and Sld5 possess the AB domain organization, only Psf1 interacts with Cdc45 (Costa et al. 2011). This enforces a stoichiometry of one Cdc45 per GINS complex. In contrast, in the archaeal GINS, two identical copies of Gins15 are present, thus conferring the potential to interact with two Cdc45 molecules per GINS complex. Native electrospray ionization mass spectrometry experiments on the reconstituted Sulfolobus CG complex revealed that this was indeed the case, revealing a mass compatible with two copies each of Cdc45, Gins15, and Gins23 (Xu et al. 2016). While it has not been directly determined, it seems likely that this organization will also apply to the euryarchaeal Thermococcus GINS•GAN assembly (Oyama et al. 2016). Although this observation suggests a distinct difference between archaeal and eukaryotic CMG, hidden Markov modeling of the predicted structure of Sulfolobus Cdc45 revealed a hitherto undocumented similarity with an unanticipated region of eukaryotic Cdc45 (Xu et al. 2016). More specifically, the RecJ fold of eukaryotic Cdc45 is interrupted by a so-called CID domain (Simon et al. 2016). Surprisingly, Sulfolobus Cdc45 was predicted to form a structure related to this CID domain. As it had already been documented that Sulfolobus Cdc45 has similarities to the RecJ fold, this observation suggests that eukaryotic Cdc45 may have arisen via a gene duplication and internal fusion event, yielding a Russian doll-like organization (Fig. 5.6a). Thus, eukaryotic Cdc45 can be viewed as a pseudodimer when compared to its archaeal antecedents.
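
The stoichiometric argument in this paragraph amounts to simple bookkeeping, which can be made explicit in a few lines of Python. The sketch below assumes, for illustration only, that every subunit able to contact Cdc45 (Gins15 in archaea, Psf1 in eukaryotes) is fully occupied; the names `max_cdc45_per_gins` and `cg_complex` are hypothetical.

```python
from collections import Counter

# Subunit composition of GINS as described in the text.
ARCHAEAL_GINS = Counter({"Gins15": 2, "Gins23": 2})
EUKARYOTIC_GINS = Counter({"Psf1": 1, "Psf2": 1, "Psf3": 1, "Sld5": 1})

# Subunits reported to contact Cdc45; Sld5 shares the AB domain order
# with Psf1 but does not interact with Cdc45.
CDC45_BINDERS = {"Gins15", "Psf1"}


def max_cdc45_per_gins(gins: Counter) -> int:
    """Count the Cdc45-binding subunits in one GINS assembly."""
    return sum(n for subunit, n in gins.items() if subunit in CDC45_BINDERS)


def cg_complex(gins: Counter) -> Counter:
    """Composition of a Cdc45-GINS complex, assuming full occupancy."""
    complex_ = Counter(gins)
    complex_["Cdc45"] = max_cdc45_per_gins(gins)
    return complex_


if __name__ == "__main__":
    print(max_cdc45_per_gins(EUKARYOTIC_GINS))
    print(dict(cg_complex(ARCHAEAL_GINS)))
```

Under these assumptions the eukaryotic assembly accommodates a single Cdc45, while the archaeal assembly accommodates two, giving the 2:2:2 composition that is compatible with the mass spectrometry data.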

Fig. 5.6

The archaeal CMG complex. (a) Relationship between bacterial RecJ and archaeal and eukaryotic Cdc45. The Sulfolobus Cdc45 corresponds to the core fold of RecJ, comprising the DHH and DHHA1 domains. Eukaryotic Cdc45 has these two domains separated by the “CID” domain. Hidden Markov modeling revealed that the CID may have evolved from a partial copy of a core RecJ fold. See Xu et al. (2016) for details. (b) Speculative model for the architecture of the archaeal CMG complex. Gins23 and Gins15 are shown in gray and blue, respectively. Their beta-strand-rich domains are shown as arrows and their alpha-helical domains as rectangles. Gins15 and Gins23 form a 2:2 complex. Further, Gins15 interacts with Cdc45, and Gins23 interacts with MCM. An open-ring form of MCM, such as that loaded on the replication origins, is depicted

Electron microscopy studies of the eukaryotic CMG complex reveal that GINS and Cdc45 interact over the interface between MCM2 and MCM5 subunits (Costa et al. 2011). This interface serves as a gate in the MCM ring, and elegant cross-linking studies have revealed that the ability of this gate to open is key to loading eukaryotic MCM(2-7) at replication origins (Samel et al. 2014). The innate asymmetry of the eukaryotic heterohexameric MCM(2-7) makes it easy to understand how the location and stoichiometry of Cdc45 and GINS association are imposed. This contrasts with the situation in archaea where the MCM is composed of six identical subunits. However, the available data indicate that MCM is recruited to origins in an open-ring form (Samson et al. 2016). It is possible that the nature of the opening between MCM subunits is such that it favors association of CG with that locus on the MCM complex (Fig. 5.6b). It may be significant that CG interacts with MCM’s N-terminal domains via the Gins23 subunit (Marinsek et al. 2006). It is conceivable that the presence of two identical MCM-interaction interfaces on archaeal CG favors interactions between MCM N-terminal domains juxtaposed across the opening in the MCM ring.

In eukaryotes, the sequential and regulated associations of first Cdc45 and then GINS with loaded MCM are pivotal events in the control of the initiation of DNA replication (Siddiqui et al. 2013; Tanaka and Araki 2013; Bell and Labib 2016). Interestingly, the so-called firing factors that facilitate this process (e.g., Sld2, Sld3, Sld7, Dpb11) are eukaryotic innovations with no discernable homologs in the archaea. Furthermore, the CDK and DDK kinases that in turn govern the behavior of the firing factors are also absent from archaea. The tight association of Cdc45 and GINS in archaeal cell extracts might imply that these factors interact en bloc with origin-associated MCM, leading to activation of MCM’s helicase activity. Whether this step in archaeal DNA replication initiation is subject to regulatory control is currently unknown. However, in species such as Sulfolobus where multiple replication origins are coordinately regulated to trigger a single initiation event per cell cycle, it is very tempting to speculate that MCM activation by CG could be a key and committing step in regulating replication initiation.