Introduction

Microbes have great potential to enrich our lives by providing medicines, feedstocks, fuels, and other products at low cost and with a small environmental footprint. However, while ideal microbial production systems have specific design characteristics, microbial evolution for fitness in a given environment is often at odds with these requirements. Therefore, substantial engineering is needed to fully realize the potential of microbes in the bio-economy. While recombinant DNA technologies have been applied to microorganisms for decades, this process is inherently slow and usually targets only one or a few genes. Furthermore, some microorganisms have thus far proved resistant to genetic manipulation.

To accelerate the pace of microbial genome engineering, a strategy has recently been described for acquisition of entire bacterial or eukaryotic chromosomes in model organisms, alteration using the well-established genetic systems in these model organisms, and installation of the manipulated chromosomes back into the organism of interest. This cycle of acquire, alter, and install (AAI) represents a strategy for multi-locus manipulation of microbial chromosomes, especially in slow-growing or difficult-to-manipulate organisms. This review outlines the importance of the host organisms Bacillus subtilis, Escherichia coli, and Saccharomyces cerevisiae (Fig. 1) to the AAI cycle. The AAI model has been fully developed for Mycoplasma mycoides and is currently in development for a variety of other microbes. While many outstanding reviews have summarized recent advances in genome editing tools (Esvelt and Wang 2013; Lee et al. 2013; Gibson 2014), our purpose here is to describe platform organisms and tools used for genome acquisition and manipulation and to outline how these systems and tools can be used in combination to perform synthetic biology at the scale of whole chromosomes.

Fig. 1

Manipulation of DNA using various platform host organisms. Selection of a platform host will be dependent on the size of the DNA, GC content, and potential toxicity. DNA can be moved between hosts by conjugation (C), electroporation (E), fusion of host organisms (F), natural transformation (N), polyethylene glycol-mediated transformation (P), or genome transplantation (T). Asterisk indicates that natural transformation was demonstrated for B. subtilis

History of chromosome acquisition

The first step in the AAI cycle is to clone or assemble the entire chromosome or large fragments of the chromosome in a suitable platform host. In this review, we focus on acquisition and alteration technologies in B. subtilis, E. coli, and S. cerevisiae. Cloning the entire genome of one organism within another organism was first demonstrated by Itaya and colleagues (Itaya et al. 2005). In this approach, the 3.5-Mb Synechocystis sp. PCC 6803 genome was introduced through a multi-step, iterative process and maintained at two loci within the B. subtilis genome. Later, a variation of this approach was used to demonstrate the assembly of the entire mouse mitochondrial and rice chloroplast genomes in B. subtilis (Itaya et al. 2008).

Using yeast to clone whole bacterial genomes as centromeric plasmids was a breakthrough that enabled the completion of genome acquisition in a single step (Benders et al. 2010) (Table 1). Several variations on this theme have been implemented to clone whole bacterial chromosomes in yeast. A commonly used approach involves first the insertion of a yeast vector into the target bacterial genome while the genome is still inside the bacterial cell (Benders et al. 2010; Karas et al. 2013a). The entire genome can then be isolated intact from the bacterial cell and introduced into yeast where it is maintained as a yeast artificial chromosome (YAC). Many DNA modification methods have been developed in yeast for precise site-specific substitution (Noskov et al. 2010; Storici et al. 2001), deletion of sequences (Noskov et al. 2010), insertion of sequences, and clustering of altered loci of various types via meiotic recombination (Suzuki et al. 2011; Pinel et al. 2011).

Table 1 Summary of prokaryotic and eukaryotic chromosomes cloned in S. cerevisiae

Cloning of a complete bacterial genome in yeast was first demonstrated for the 0.6-Mb genome of Mycoplasma genitalium (Benders et al. 2010; Gibson et al. 2008a, b). This organism has the smallest genome among all free-living bacteria known to date. Additional genomes including those of Mycoplasma pneumoniae (0.8 Mb) (Benders et al. 2010), M. mycoides (1.1 Mb) (Gibson et al. 2010a; Lartigue et al. 2009), and Mycoplasma capricolum (1.0 Mb) (Karas et al. 2013a) were also cloned in yeast. These organisms were first selected for genome cloning because of their small genome sizes and also because of the expectation that the genomes were unlikely to produce toxic gene products in yeast as a result of their alternative genetic code. The mycoplasmas use a genetic code in which the UGA codon is translated as tryptophan. Moreover, this codon is used more frequently in these organisms than UGG, the other tryptophan codon. In the standard genetic code used by all three of the platform hosts described in this review, the UGA codon is reserved for translational termination. Therefore, transcripts derived from accidental expression of mycoplasma genes would presumably not be properly translated in yeast because UGA codons would be read as premature “stop” codons, and the truncated products would likely be harmless to yeast.
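The consequence of the alternative genetic code can be illustrated with a minimal Python sketch. The two codon tables follow the standard code and the mycoplasma code as described above (UGA read as tryptophan). The short open reading frame `orf` is a made-up example rather than a real mycoplasma gene, and the function name `translate` is ours.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order (DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Mycoplasma code: identical, except that TGA (UGA) encodes tryptophan (W).
MYCOPLASMA = {**STANDARD, "TGA": "W"}


def translate(dna, table):
    """Translate from the first codon and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical ORF with an internal TGA codon (illustration only).
orf = "ATGAAATGAGGTTGGTAA"
print(translate(orf, STANDARD))    # MK    (truncated at the internal TGA)
print(translate(orf, MYCOPLASMA))  # MKWGW (TGA read as tryptophan)
```

Under the standard code the product is truncated at the first UGA, which is the behavior expected to render accidentally expressed mycoplasma genes harmless in yeast.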

Other bacterial chromosomes with the standard genetic code have been successfully acquired in yeast. The 1.7-Mb genome of the cyanobacterium Prochlorococcus marinus MED4 (Tagwerker et al. 2012) and the 1.8-Mb Haemophilus influenzae genome (Karas et al. 2013a) were each cloned in yeast without difficulty. However, cloning the 1.5-Mb genome of Acholeplasma laidlawii was hampered by what was found to be a single gene that was toxic to yeast (Karas et al. 2012). To identify this toxic gene and to allow the A. laidlawii chromosome to be cloned, the genome was first cloned as several large fragments to identify the region that was toxic to yeast using the methods described below. This region was further subdivided in several steps to identify the problematic gene. The A. laidlawii genome excluding the toxic gene was then cloned in yeast.

Whole-genome acquisition for Synechococcus elongatus was challenging both for its larger size (2.7 Mb) and higher GC content (discussed below), but this genome was separately cloned as 30 overlapping pieces to establish a library of 30 yeast strains each carrying a large, ∼112-kb genomic fragment (Noskov et al. 2012). This study suggested that no gene in this organism was toxic to yeast. The likely reason for the difficulty in cloning the whole genome is that the average GC content of the S. elongatus chromosome is higher than that in yeast where frequent AT-rich replication origins are required. Therefore, a sequence that can act as a replication origin is rare in the S. elongatus genome. In agreement with this possibility, five of the initial 30 genomic fragments were successfully assembled when three additional yeast replication origins were introduced (Noskov et al. 2012). This approach is expected to facilitate cloning of GC-rich chromosomes and chromosomal fragments.

In addition to bacterial chromosomes, eukaryotic chromosomes have also been successfully acquired in model organisms. The mouse mitochondrial genome was assembled in both B. subtilis and yeast (Gibson et al. 2010b; Itaya et al. 2008). The chloroplast genomes from rice and Chlamydomonas reinhardtii were assembled in B. subtilis and yeast, respectively (Itaya et al. 2008; O’Neill et al. 2012). Entire nuclear chromosomes from eukaryotic diatom algae have been successfully assembled and maintained in yeast and also in E. coli (Karas et al. 2013b). An entire yeast nuclear chromosome has been replaced with synthetic DNA fragments through an iterative assembly process (Annaluru et al. 2014).

As the history outlined above describes, both yeast and B. subtilis have been useful platforms for chromosomal acquisition. E. coli has also played an important role in facilitating genome assembly. In the following section, we describe tools and techniques used to acquire and alter genomes in each of these platform organisms. We also highlight how the three platform hosts can be used in combination to perform parts of the AAI experimental cycle (Fig. 1).

Platform organisms for chromosome acquisition and modification

B. subtilis

B. subtilis strain Marburg 168 (Bsu168) has proven to be an important system for genome acquisition as well as alteration. The utility of the B. subtilis system is enhanced by very efficient natural transformation and homologous recombination (Johnston et al. 2014). The Bsu168-based system has been shown to take up DNA as large as 100 kb (Kaneko and Itaya 2010). DNA fragments taken up can then be replicated as a plasmid if a replication origin sequence is present or can be integrated into the genome by homologous recombination.

Itaya et al. (2000) introduced a new concept of cloning large DNA fragments by incorporating them into the Bsu168 chromosome which effectively serves as a cloning vector (also known as the Bacillus GenoMe (BGM) or BGM vector). One of the techniques to introduce DNA to BGM is called “domino cloning” (Itaya et al. 2008). This method begins by inserting a plasmid sequence into the BGM to serve as a “landing pad” for the subsequent DNA integration. DNA fragments to be inserted into the BGM vector are then introduced into Bsu168 to recombine into the chromosome at the landing pad. The first fragment is designed and built to contain homology to the second fragment at the end so that the second fragment can be introduced downstream of the first fragment and upstream of the 3′ end of the landing pad. Fragment introduction is also associated with an exchange of antibiotic resistance markers. Therefore, the sequential introduction of fragments and the selection of alternative markers lead to the elongation of the region cloned within the BGM vector. The technique was applied to clone the 134.5-kb rice chloroplast genome (Itaya et al. 2008). One drawback is that the assembly is only stepwise and does not permit combination of multiple fragments in a single step as can be performed in yeast (Gibson et al. 2008b).

A second approach to integrate DNA into the BGM vector is a variation of the domino method called the “inchworm elongation” (IWe) method (Itaya et al. 2003, 2005). In this method, two unique, but adjacent landing pad sequences are installed in the BGM vector. The first DNA fragment to be introduced is flanked on each side by one of the two landing pad sequences, and the incoming fragment is recombined between the two landing pads. To add a second fragment, a third landing pad sequence is installed downstream of landing pad sequence 2, and the incoming second fragment is flanked by sequences homologous to landing pads 2 and 3. This allows recombination at landing pads 2 and 3 to insert the second fragment. The resulting BGM vector has landing pad 1 followed by fragment 1, landing pad 2 followed by fragment 2, and landing pad 3. This process can be continued progressively, but each incoming fragment must be flanked by a unique landing pad sequence. An impressive demonstration of the inchworm elongation method was the integration of 3500 kb of the Synechocystis PCC 6803 genome, which represents most of that genome, into the BGM vector (Itaya et al. 2005). The complete genome of Synechocystis could not be cloned due to toxicity of the ribosome genes (Itaya et al. 2005).

Multi-fragment assembly was developed for B. subtilis in a method called ordered gene assembly in Bsu168 (OGAB) (Tsuge et al. 2003, 2007; Nishizaki et al. 2007). This method allows for the isolation of plasmids containing at least six pre-assembled fragments. The main advantage of this method is that linear DNA fragments can be assembled into a replicative plasmid in vitro and transferred to B. subtilis via natural transformation. The disadvantage is that the assembly is not always seamless and requires a restriction digest and ligation step before transformation.

E. coli

E. coli became one of the best studied prokaryotic microbes because of easy laboratory propagation, short generation time, and available genetic tools for DNA manipulation. As discussed below, many of these methods directly support strategies to assemble entire chromosomes using other platform hosts. It has also been extensively used as a host for maintaining large foreign DNA fragments as bacterial artificial chromosomes (BACs). However, exogenous genes are sometimes toxic in E. coli. For example, Gibson et al. (2010a) described the difficulty in cloning and maintaining M. mycoides DNA fragments that were larger than 10 kb in E. coli when the fragments had a GC content of 24 %. One of the ∼100-kb S. elongatus plasmids cloned in yeast could not be moved to E. coli. While the whole genome of H. influenzae could be cloned in yeast (Karas et al. 2013a), this was not possible in E. coli (Holt et al. 2007). When 246,045 different genes from 79 prokaryotic genomes were tested for maintenance in E. coli, certain sets of genes could never be recovered (Sorek et al. 2007). This is likely due to the widespread transcription of the foreign DNA in E. coli (Warren et al. 2008). Approximately half of all H. influenzae genes cloned on BACs in E. coli were transcribed. This proportion of transcribed foreign DNA was smaller for the cloned Pseudomonas aeruginosa DNA, and the transcription of cloned human DNA by E. coli was only occasional. Thus, the widespread transcription of many cloned prokaryotic genes may be the cause of the lack of success for bacterial chromosome cloning in E. coli.

Despite many examples of apparently toxic exogenous genes negatively impacting cloning success, large fragments from other species can be cloned and maintained in E. coli as long as the sequences are tolerated by the host. Examples include the 454-kb fragment of S. elongatus PCC 7942 (Noskov et al. 2012) and two complete eukaryotic chromosomes from Phaeodactylum tricornutum, each approximately 500 kb (Karas et al. 2013b). These large plasmids were conveniently introduced into E. coli using standard electroporation. While many powerful techniques are available for engineering the introduced DNA molecules in E. coli, two related techniques deserve mention. First, we review lambda Red recombineering as a useful method to support chromosome acquisition strategies, and second, we consider multiplex automated genome engineering (MAGE)/conjugative assembly genome engineering (CAGE), which demonstrates the great power of using the acquire-alter-install cycle for modification of a microbial chromosome in E. coli.

Lambda Red recombineering

Cloned chromosomes or large fragments of chromosomes can be efficiently manipulated in E. coli using the lambda Red system with linear DNA fragments created by polymerase chain reaction (PCR). This system uses three proteins from the lambda phage genome: Gam, Bet, and Exo (Datsenko and Wanner 2000; Yu et al. 2000). The Gam protein protects the linear DNA from nuclease digestion, while the Exo and Bet proteins create and protect ssDNA at the ends of the linear DNA fragment, respectively, to facilitate recombination with the target DNA. Lambda Red techniques have been recently updated with the discovery that deletion of several E. coli nucleases and modifications to helicase/primase increase the efficiency of obtaining the targeted modifications (Lajoie et al. 2012; Mosberg et al. 2012). The lambda Red system has proven to be an invaluable tool to add or remove sequences from cloned chromosomes or near-chromosome-sized fragments (Karas et al. 2013b; Noskov et al. 2012; Yang et al. 1997). As described below in the yeast section, the lambda Red technique was used to add sequences that facilitate the assembly of high-GC DNA in yeast.

MAGE/CAGE

Typically, only one change at a time is made with lambda Red recombineering, but multiple simultaneous changes can be made using the related technique called MAGE (Wang et al. 2009). This technique uses pools of oligonucleotides, each containing a modification to the genome as a substitution, deletion, or insertion of a few nucleotides at a time, to transform and modify the recipient E. coli. A similar process for oligonucleotide-based multiplex manipulation in yeast is called YOGE (DiCarlo et al. 2013). Simultaneous changes can be made throughout the genome of the host, and the probability of obtaining a clone with multiple simultaneous changes in its genome increases with the number of rounds of serial transformation of the cell population with the oligonucleotide pool (Wang et al. 2009). MAGE has been effectively coupled with CAGE to allow hierarchical assembly to be applied to genome-scale editing (Isaacs et al. 2011). For example, changes to multiple parts of the chromosome could be initiated in parallel and then combined hierarchically to yield a final molecule with changes throughout the chromosome. While MAGE and CAGE have not, to our knowledge, been applied to cloned chromosomes, the potential of these multiplex engineering tools to rapidly alter a cloned chromosome may make them a useful alternative to partial or complete synthesis.
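Why more rounds help can be seen in a deliberately simplified model, sketched here in Python. The model is our assumption and is not taken from the cited studies: every locus is converted independently with the same per-round efficiency, edits are never reverted, and growth effects are ignored. The function name and the example values are hypothetical.

```python
def fraction_fully_edited(per_round_efficiency, rounds, loci):
    """Expected fraction of cells carrying all targeted edits.

    Simplifying assumptions (ours): loci are edited independently, each with
    the same per-round efficiency, and an edit is never lost once made.
    """
    # Probability that one locus has been converted after the given rounds.
    per_locus = 1.0 - (1.0 - per_round_efficiency) ** rounds
    # All loci must be converted in the same cell.
    return per_locus ** loci


# Illustrative values only: 10 % per-round efficiency, five target loci.
for rounds in (1, 5, 10, 20):
    print(rounds, fraction_fully_edited(0.10, rounds, loci=5))
```

Under these assumptions the fully edited fraction rises steeply with the number of rounds, while adding target loci lowers it, which is one motivation for splitting the work across strains and combining the results hierarchically.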

S. cerevisiae

The ability of S. cerevisiae to host large DNA fragments, efficiently recombine DNA molecules, and largely withstand or avoid detrimental effects of cloned foreign genes is the main reason for selecting yeast for the purpose of cloning and maintaining chromosome-scale DNA. Out of thousands of genes among the eight complete bacterial genomes cloned in yeast, there was only one gene that was toxic to yeast (Karas et al. 2012). In addition, two complete eukaryotic chromosomes were cloned in yeast without any noticeable impact on growth rate (Karas et al. 2013b). Yeast is also compatible with numerous eukaryotic genes as it has been used as a host organism to make YAC libraries for various eukaryotic species.

In addition to toxicity of foreign DNA, another important consideration when using yeast systems to acquire chromosome-scale DNA is the origin of replication. The yeast replication origin (also known as an autonomously replicating sequence (ARS)) is a low-GC region that contains a consensus for binding the origin recognition complex (ORC) (Newlon and Theis 1993). The ARS sequence is commonly included in yeast cloning vectors, but as will be described below, there are occasional reasons for not including this sequence in the vector. When cloning large fragments of low-GC DNA, sequences that function as an ARS are abundant within the insert DNA. These sequences are rarely found in chromosomes with a GC content of over 40 %. Thus, consideration of the GC content of the targeted chromosome is critical to design a proper cloning strategy for yeast. Based on our experience, chromosome-sized DNA can be cloned in a plasmid containing a single ARS when the GC content of the fragment is less than 40 % (Table 2). For DNA with a GC content of greater than 40 %, it is recommended that genomes are first cloned as 100-kb overlapping fragments, followed by addition of yeast replication origins to each fragment (Karas et al. 2013b; Noskov et al. 2012). To add the yeast origins to each fragment, a multi-host approach can be used. Plasmids containing the large DNA fragments can first be moved from yeast to E. coli where yeast origins can be added using the rapid and efficient lambda Red recombineering methods described above. These plasmids can then be re-isolated and returned to yeast for assembly into entire chromosomes. In addition to yeast origins of replication, at least one extra yeast marker should be added to one of the fragments to aid in the final assembly (Karas et al. 2013b). An alternative strategy to clone or assemble high-GC chromosomes would be to add yeast replication origins directly to the microbial chromosome of interest via homologous recombination or transposon insertion prior to chromosome transfer to yeast.
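The GC-based rule of thumb above can be written down as a small planning helper. This is a sketch under stated assumptions rather than an established tool: the function names are ours, the genome is treated as a linear coordinate range (closing the circle is ignored), the 10-kb overlap is an arbitrary illustrative choice, and a GC content of exactly 40 % (which the text leaves open) is treated as high-GC.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def tile(length, size=100_000, overlap=10_000):
    """Split the range 0..length into overlapping (start, end) windows."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than the fragment size")
    step = size - overlap
    windows, start = [], 0
    while True:
        end = min(start + size, length)
        windows.append((start, end))
        if end == length:
            return windows
        start += step


def cloning_plan(gc, length):
    """Choose a yeast cloning strategy from GC fraction and size in bp."""
    if gc < 0.40:
        # Low-GC DNA: ARS-like sequences are expected within the insert.
        return {"strategy": "single molecule", "fragments": [(0, length)]}
    # High-GC DNA: clone ~100-kb overlapping pieces, add an ARS to each.
    return {"strategy": "overlapping fragments, ARS added to each",
            "fragments": tile(length)}


# Hypothetical 2.7-Mb genome with an assumed GC fraction of 0.55.
plan = cloning_plan(0.55, 2_700_000)
print(plan["strategy"], len(plan["fragments"]))
```

The fragment size and overlap are parameters, so the same helper can be rerun if a different fragment size is more practical for a given genome.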

Table 2 Suggested hosts for DNA cloning based on the size of DNA and GC content

Two types of plasmids are commonly used in yeast: the high-copy 2μ plasmid (Chan et al. 2013) and the low/single-copy centromeric plasmid. Only centromeric plasmids have been used to support large DNA fragments. The design of these plasmids should take into consideration the GC content of the targeted DNA (Fig. 2). For high-GC DNA, an ARS should preferably be added every 100 kb within the fragment. For low-GC DNA (less than 40 %), the plasmid only needs to contain a centromere and selection marker sequences for yeast. Omission of the ARS from the plasmid has the advantage of reducing the background of incorrect, “vector-only” colonies when cloning low-GC chromosomes using the transformation-associated recombination (TAR) cloning techniques described below. Thus, by omitting the ARS from the vector, replication of the vector requires successful recombination to acquire the low-GC chromosomal DNA that is likely to have functional ARSs on it.

Fig. 2

Strategies to assemble whole chromosomes in yeast. a High-GC (>40 %) chromosomes should be cloned as 100-kb fragments with sequence overlaps of 200 to 20,000 bp between adjacent fragments. To each fragment, an extra ARS should be added (A2, 3, 4) and one of the middle fragments should contain an additional yeast marker for positive selection (M2). All fragments should be assembled with a vector that contains a centromere (C), an ARS (A1), and a positive selection marker (M1). b Low-GC chromosomes can be cloned directly as a single molecule by recombination with a yeast vector which contains a centromere (C) and a positive selection marker (M1). The second marker (M2*) for negative selection can be added at one end of the vector. For the highest chance of success, the chromosome should be cut, for example, with a restriction enzyme that cuts at the beginning of homology sequence 1 (HS1) and the end of homology sequence 2 (HS2)

A useful genetic technique for cloning bacterial chromosomes in their entirety or as large fragments is TAR cloning (Kouprina and Larionov 2008). In this technique, linear DNA containing the yeast plasmid flanked by sequences of homology to a DNA target is co-transformed into a yeast cell together with the target DNA. Homologous recombination within the yeast results in an assembly of a plasmid with the target DNA cloned. The homologous sequences on the plasmid should have at least 60 bp of homology to the target sequence at each end, but the use of longer sequences can improve the recombination in some cases (Karas et al. 2012; Yamanaka et al. 2014). An additional element that can be added to improve TAR cloning success is a negative selection marker for yeast. For example, the URA3 gene is typically used as a positive selection marker, but it can be used as a negative selection marker when 5-fluoroorotic acid (5-FOA) is added (Boeke et al. 1984). The URA3 gene is added at one end of the vector, and it is designed to be lost upon recombination with the correct fragment (Fig. 2). This recombination event to eliminate the URA3 marker is selected using 5-FOA and helps to remove most of the background that results from the yeast vector closing on itself. To clone large fragments or entire chromosomes, DNA should be prepared in agarose plugs as described in the Bio-Rad CHEF manual to prevent shearing. Using this approach, intact DNA up to at least 1.7 Mb has been isolated and cloned (Tagwerker et al. 2012; Karas et al. 2012). In-plug digestion of the DNA at a target site for recombination with the vector will aid in homologous recombination (Karas et al. 2013b). TAR cloning approaches can assist in troubleshooting genome cloning. For example, identifying the toxic gene in A. laidlawii was facilitated by a TAR cloning approach in which large fragments of the genome could be separately cloned to eliminate each as the source of the toxic gene. The genome excluding the toxic gene could then be cloned. While the scheme involving TAR cloning was initially established to handle a single toxic gene, cloning a genome by assembly in yeast is also suitable for removing multiple toxic genes at once.
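The geometry of a TAR vector can be made concrete with a short Python sketch. It is an idealized string model under our own assumptions: sequences are plain strings, recombination is exact, and the backbone is a placeholder label rather than real vector sequence. The function names `tar_vector` and `recombine` are hypothetical.

```python
import random


def tar_vector(target, backbone, hook_len=60):
    """Build a linear TAR vector whose ends match the ends of the target.

    The hook copied from the END of the target goes on the left of the
    backbone and the hook copied from the START goes on the right, so that
    recombination at both hooks closes a circle around the target.
    """
    if len(target) < 2 * hook_len:
        raise ValueError("target is too short for two hooks")
    left_end_hook = target[:hook_len]
    right_end_hook = target[-hook_len:]
    return right_end_hook + backbone + left_end_hook


def recombine(target, vector, hook_len=60):
    """Idealized circular product, written linearly from the target start."""
    if not (vector.startswith(target[-hook_len:])
            and vector.endswith(target[:hook_len])):
        raise ValueError("vector hooks do not match the target ends")
    backbone = vector[hook_len:-hook_len]
    return target + backbone


# Illustration only: random 500-bp "target" and a placeholder backbone label.
rng = random.Random(0)
target = "".join(rng.choice("ACGT") for _ in range(500))
backbone = "<CEN-MARKER>"
vector = tar_vector(target, backbone)
clone = recombine(target, vector)
print(len(vector), len(clone))
```

The default of 60 bp per hook follows the minimum stated above; longer hooks can be requested through `hook_len`.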

Due to its ability to accept and maintain large DNA molecules and join them using homologous recombination, yeast has been used as a convenient host for assembling synthetic genomes. To synthesize a genome, thousands of synthetic oligonucleotides based on sequences from a computationally designed genome are assembled first using in vitro reactions. The resulting DNA fragments are introduced into yeast to hierarchically assemble larger fragments and, eventually, the complete genome or genome-sized molecule. Using the design of a mycoplasma species, an artificial genome was generated and transferred to recipient bacterial cells using a method termed genome transplantation (Lartigue et al. 2007) to create the first cell solely controlled by the rebooted, artificial donor genome (Gibson et al. 2010a).

While many transformation methods are available for yeast, the transformation of spheroplasts should be used when cloning whole chromosomes or large fragments in yeast (Hinnen et al. 1978; Karas et al. 2013a; Kouprina and Larionov 2006). This method requires an enzymatic treatment of yeast cells to partially remove yeast cell wall in order to produce transformation-competent, spheroplasted cells. The main advantage of using this method is the ability to assemble many small or large DNA fragments at the same time. Disadvantages include the requirement that spheroplasts must be prepared fresh each time to achieve maximal competency and that transformed spheroplasts must be “pour plated” within the agar medium matrix to allow for cell wall regeneration making it more difficult to recover single colonies (i.e., each colony has to be handpicked). Alternatively, to make this process automated, liquid media can be added on top of the agar 1 day after transformation to allow cells to grow into the liquid (B. J. Karas, unpublished data). Next, the pool of yeast transformants can be diluted and plated to isolate single colonies, which can then be robotically picked.

Yeast spheroplasts can be combined with purified DNA, but they can also be mixed with spheroplasts of other yeast strains or bacterial cells to allow DNA transfer (Karas et al. 2013a, 2014a; Gyuris and Duda 1986). This approach is suitable for transfer of DNA fragments as large as or possibly larger than 1.8 Mb in size into yeast, because the donor cells protect their DNA cargo until there is contact with yeast. In order to transfer chromosomes by direct, cell-to-cell genome transfer, a yeast vector sequence (containing a selectable marker and centromere) needs to be inserted into the donor DNA (e.g., a bacterial chromosome). This can be achieved using any transformation methods available for the donor strain (e.g., natural transformation, conjugation, and transposome). We expect that the Tn5 transposome method is particularly effective for cloning genomes from “wild” species because the loaded transposase appears to either protect the DNA material from restriction or accelerate the integration into the bacterial genome so that restriction is evaded (Karas et al. 2014b).

Cloned chromosomes or chromosome fragments can be easily manipulated using powerful yeast genetic tools. For example, any desired DNA region can be deleted via replacement with a cassette that contains a yeast marker flanked by roughly 40 bp of homology at each end to the targeted sequence. Alternatively, seamless deletions can be performed using the tandem repeat coupled with endonuclease cleavage (TREC) method (described below). The most convenient way to introduce such cassettes is to use lithium acetate transformation, which does not require generation of spheroplasts (Gietz 2014).
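The marker-replacement cassette described above can be laid out as follows. This is a minimal sketch that treats sequences as strings and assumes exact recombination; the function names, the 0-based half-open coordinates, and the placeholder marker label are our assumptions.

```python
def deletion_cassette(chromosome, start, end, marker, arm=40):
    """Marker flanked by homology arms bordering chromosome[start:end]."""
    if start < arm or end + arm > len(chromosome) or start >= end:
        raise ValueError("region must leave room for both homology arms")
    upstream_arm = chromosome[start - arm:start]
    downstream_arm = chromosome[end:end + arm]
    return upstream_arm + marker + downstream_arm


def apply_replacement(chromosome, start, end, marker):
    """Idealized outcome: the targeted region is replaced by the marker."""
    return chromosome[:start] + marker + chromosome[end:]


# Illustration only: a synthetic 300-bp "chromosome" and a placeholder marker.
chromosome = "A" * 100 + "G" * 100 + "C" * 100
cassette = deletion_cassette(chromosome, 100, 200, "<MARKER>")
edited = apply_replacement(chromosome, 100, 200, "<MARKER>")
print(cassette)
print(len(edited))
```

Because the marker remains at the target site, this replacement is not seamless.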

TREC

The technique known as TREC has been specifically developed for engineering of genomes cloned in yeast (Noskov et al. 2010). This technique allows the user to add, delete, or substitute any sequence of interest in yeast and works especially well for editing yeast-maintained bacterial genomes. The TREC technique first uses homologous recombination to insert a CORE cassette (composed of an I-SceI recognition sequence, a galactose-inducible I-SceI gene, and the URA3 cassette) followed by a region of homology into a target site. Insertion of the CORE cassette into the target site is facilitated by 50-bp regions of DNA sequence corresponding to the target site. The homology region that follows the CORE cassette allows for a subsequent recombination with a sequence upstream of the targeted CORE after insertion. After insertion, the I-SceI gene is induced by galactose, leading to a double-stranded break (DSB) at the 5′ end of the CORE cassette. The CORE cassette is then removed by a very efficient homologous recombination event between the region 5′ to the CORE cassette and the homology region inserted with the CORE cassette resulting in the seamless desired change.

TREC was designed to address the problem of sequence instability during manipulation of cloned genomes in yeast (Noskov et al. 2010). For large, artificial chromosomes cloned in yeast, standard gene replacement techniques were found not to work well. For example, typical manipulation of chromosomally encoded yeast genes involves a two-step process. In the first step, homologous recombination is used to insert the URA3 marker into a target site, removing the sequence that is to be altered. In the second step, the URA3 marker is used as a counter-selectable marker in the presence of 5-FOA, and a target piece of DNA containing the desired change is substituted for the URA3 marker. In modifications of chromosomally encoded yeast genes, this second recombination step usually occurs at a high rate, allowing the desired change to be identified efficiently. When the change is to be made in a chromosome-sized plasmid maintained in yeast (e.g., containing a bacterial genome), the efficiency of obtaining the correct alteration drops significantly, likely through spontaneous rearrangement events at repeated sequence sites in the cloned genome. Unlike in the yeast chromosomes, the genes between these repeated sequences in the cloned chromosomes are not essential to yeast, so their loss, together with the URA3 marker, is unfortunately easily selected in the presence of 5-FOA. Because strategies to manipulate cloned chromosomes often involve work across multiple platform organisms, it is important to note that TREC-like approaches have also been developed for E. coli (Tischer et al. 2006; Yu et al. 2008).

Cross-platform chromosome modification tools

Cre/loxP

Phage-derived site-specific recombination systems such as Cre/loxP are very useful tools for manipulation of cloned genomes in all three platform hosts. For example, loxP sites can be added into cloned genomes or built into synthesized genomes to permit an exchange of large regions of sequence. In a recent demonstration of the power of this technique, a 72-kb synthetic DNA fragment was successfully recombined into the E. coli genome, replacing a 126-kb region and creating multiple, non-contiguous changes in this region of the chromosome (Krishnakumar et al. 2014). Although in this case the fragments were reinserted into a native chromosome rather than a cloned chromosome, the approach is expected to be as effective on a cloned chromosome or a large fragment of a chromosome. The Cre/loxP system has also been used to randomly generate chromosome alterations throughout the synthetically constructed regions of a yeast chromosome (Dymond et al. 2011).

Programmable nucleases

One of the greatest recent advances in molecular biology tools has been the development of sequence-programmable nucleases. Two of these technologies, transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPRs), have been widely adopted in a variety of hosts where molecular tools and genetic systems were previously lacking (Kim and Kim 2014), but they may also have utility in manipulation of cloned, exogenous chromosomes in E. coli and yeast.

TALENs comprise a fusion of a transcription activator-like effector (TALE) region and a C-terminal FokI nuclease. The sequence specificity of the nuclease is programmed by linking together a series of nearly identical, repeated domains in the TALE region that each binds a different nucleotide. Linking 18–20 of these domains allows for sufficiently specific binding in most complex genomes. Because the FokI nuclease functions as a dimer, a pair of TALENs must be engineered with recognition sequences that are separated by 14–20 bases so that the nucleases can dimerize and cut once both TALENs are bound to their respective target sequences (Sanjana et al. 2012).
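The spacing constraint on a TALEN pair can be illustrated with a short sketch that enumerates candidate site pairs on a linear sequence. The function names, the default site length of 18 bases, and the convention that the second TALEN binds the opposite strand are assumptions made for illustration. Additional design rules applied by real design tools, such as base preferences flanking each site, are not modeled.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def talen_pair_candidates(seq, site_len=18, spacer_min=14, spacer_max=20):
    """Enumerate candidate TALEN pairs on a linear sequence.

    Each candidate is (left_start, left_site, spacer_len, right_site).
    left_site is read on the top strand; right_site is the reverse
    complement of the downstream top-strand window, i.e. the sequence
    the second TALEN would recognize on the bottom strand.
    """
    seq = seq.upper()
    candidates = []
    for left_start in range(len(seq)):
        left_end = left_start + site_len
        for spacer_len in range(spacer_min, spacer_max + 1):
            right_start = left_end + spacer_len
            right_end = right_start + site_len
            if right_end > len(seq):
                break  # longer spacers would also run off the end
            left_site = seq[left_start:left_end]
            right_site = reverse_complement(seq[right_start:right_end])
            candidates.append((left_start, left_site, spacer_len, right_site))
    return candidates
```

In practice, the candidate list would be filtered further, for example by checking each site for uniqueness within the cloned chromosome and the host genome.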

Another technique at the forefront of genome editing is the CRISPR-Cas9 system. This system works by expressing a nuclease (Cas9) that requires a guide RNA (gRNA) for its sequence specificity. Thus, the specificity of the system is determined by the sequence programmed into the gRNA (Hsu et al. 2014). Because the specificity is so easily programmed, multiplex editing approaches can be undertaken by expressing multiple gRNAs (Cong et al. 2013). Although the system was initially criticized for a lack of specificity, newer techniques have improved its fidelity (Ran et al. 2013).
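How gRNA target sites might be enumerated for multiplex editing can be sketched as follows. The sketch assumes the commonly used Streptococcus pyogenes Cas9 convention of a 20-nucleotide protospacer immediately 5' of an NGG protospacer-adjacent motif (PAM), a detail not specified in the text above. The function names are hypothetical, and no off-target scoring is attempted.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_protospacers(seq, guide_len=20):
    """Return (strand, start, protospacer, pam) for each NGG-adjacent site.

    Both strands are scanned. start is the 0-based index of the first
    protospacer base on the strand being scanned; for "-" hits the index
    refers to the reverse-complemented sequence.
    """
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits
```

A set of guides for multiplex editing could then be drawn from the returned list, one or more per locus to be altered.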

Two applications of these technologies to the acquire-alter-install approach can be easily recognized. First, because both TALENs and CRISPRs have been so widely used in a variety of hosts to promote homologous recombination, they could assist in the insertion of the yeast elements into a specific site or could be used to alter a chromosome of interest to remove toxic genes in the native organism before moving to the cloning platform organism (i.e., yeast, E. coli, or B. subtilis). Second, they could be used in E. coli or yeast to assist in the creation of targeted changes throughout the cloned chromosome. Although these and the other genome editing techniques described above were initially developed for editing native chromosome loci, they are expected to be useful for editing cloned chromosomes.

Installation of cloned and manipulated chromosomes

The whole genome of M. mycoides can be transplanted into the related species M. capricolum to replace the existing genetic “software” and convert the cell to a fully functional M. mycoides (Lartigue et al. 2007) (Fig. 3). This breakthrough technique in combination with genome synthesis produced the first synthetic cell (Gibson et al. 2010a). If this technique could be adopted for other microbes, it would revolutionize microbial genetics and lead to a new generation of designer microbes.

Fig. 3

Proposed mechanisms for booting-up (transplantation) of mycoplasma genomes. Donor DNA is mixed with recipient cells resuspended in a CaCl2 solution to which polyethylene glycol (PEG) is added. This results in DNA/recipient cell aggregation, which may lead to cell fusion. During this process, some of the donor DNA is taken up by the recipient cells. Next, the PEG solution is removed and medium containing an antibiotic specific to the donor genome is added. Only successful transplants or cells carrying a recombinant genome containing the selection marker for the donor genome will survive

Based on the mycoplasma work and other transformation studies, desired tasks for adaptation of the genome transplantation process to other species should include: (1) development of restriction-deficient recipient strains, (2) development of recombination-deficient recipient strains, (3) improvement of protocols to prepare intact donor genomic DNA, and (4) establishment of conditions (e.g., temperature, polyethylene glycol (PEG) treatment, and cell wall removal) to activate the membrane and/or cell wall of recipient cells for uptake of large DNA.

Genome transplantation is the ultimate way to transfer whole chromosomes, but in the absence of genome transplantation, a stepwise replacement method could be used. For example, Krishnakumar et al. (2014) showed that a 126-kb segment of the E. coli genome can be replaced with a synthetic version which was designed to contain only 72 kb of the original fragment. Large regions containing multiple modifications can also be reinstalled using homologous recombination-based methods such as CAGE for E. coli (Isaacs et al. 2011) and stepwise replacement with synthetic fragments in yeast (Annaluru et al. 2014). Following these approaches, entire microbial genomes could be altered, minimized, or modularized as desired. With the recent advancement in programmable nucleases, replacement of large chromosomal regions may become even easier.

Future directions

The foundations of a microbial strain improvement experimental cycle have been established that involves the capture of the chromosome of interest into a platform host organism, alteration or engineering of the chromosome in the platform organism, and re-installation of the modified chromosome in the microbe of interest (Fig. 1). This cycle can be abbreviated as AAI. All of the steps of the AAI cycle have been demonstrated, and by far the greatest progress in recent years has occurred in the acquisition step. The importance of yeast in this step is clearly demonstrated by the growing number of intact bacterial and even eukaryotic chromosomes that can be maintained. While E. coli does not currently have any demonstrated ability to maintain entire bacterial chromosomes, the convenience and speed of working with E. coli suggest that efforts should continue to develop chromosome cloning methods for this organism. While prokaryotic chromosomes are often toxic to E. coli due to undesired transcription, whole eukaryotic chromosomes could be cloned in E. coli, likely because of the very different transcriptional systems (Karas et al. 2013b). Developing methods to clone and maintain megabase-scale plasmids in E. coli would greatly facilitate chromosome acquisition technologies. Since many bacteria maintain megabase-sized plasmids (Barnett et al. 2001), this could be possible. We have successfully maintained 454-kb plasmids in E. coli (Noskov et al. 2012), and to our knowledge, the upper bound of plasmid maintenance in E. coli has not yet been determined. Chromosome acquisition technologies will also be positively affected by improvements in DNA synthesis technologies. Although the chromosome acquisition step is not dependent on genome synthesis, the expectation of dramatically cheaper DNA synthesis in the next 5–10 years will permit the more widespread use of genome synthesis.

Great opportunities remain to expand the methodologies and range of recipients for the re-installation step. This is especially true for eukaryotic organisms that methylate DNA. While DNA is the software of life, some critical instructions such as DNA methylation are not necessarily included. Improvement in the custom editing of epigenetic features using TALEN- or CRISPR-mediated methylation or other programmable enzymatic tools could mimic a eukaryotic DNA methylation pattern at sufficient resolution to achieve chromosome function during re-installation.