5.1 Genome Plasticity, Horizontal Gene Transfer and Evolution of the Bacterial Genome

Bacteria are a diverse group of ubiquitous microorganisms capable of surviving in and tolerating a vast range of environments. Bacterial genomes are flexible (plastic), adaptive entities, a property that results from the complex and dynamic nature of bacterial chromosomes (Darmon and Leach 2014). Genome evolution comprises the processes through which the content and arrangement of a species’ genetic information change over time. The plasticity and evolution of the bacterial genome are facilitated by point mutations, genome rearrangements (deletions, duplications, inversions, and translocations), and horizontal gene transfer (HGT) (Fig. 5.1). Of these, HGT is undoubtedly one of the major creative forces driving bacterial evolution: it molds the gene repertoires of bacterial species and generates and maintains population diversity. HGT takes place across the bacterial genome and gives rise to strikingly different adaptations (Ochman et al. 2000). It has produced significant ecological differences among closely related bacterial strains belonging to a single “species” taxon (Welch et al. 2002). Bacteria benefit from HGT because it allows them to adapt to various environments and colonize new niches, so that evolution can proceed in “quantum leaps” (Hacker and Carniel 2001). HGT generally involves the acquisition of DNA fragments that can move from one cell to another, within a genome, or both (Bellanger et al. 2014). In bacteria, HGT is brought about by transformation, transduction, and conjugation (Fig. 5.2), which transfer genes across potentially distant bacterial lineages (Bazin et al. 2020). It is facilitated by mobile genetic elements such as conjugative plasmids, transposons, insertion elements, bacteriophages, and genomic islands or GIs (Hacker and Carniel 2001; Milkman 2004).
Mobile genetic elements can encode factors relating to drug resistance, bacterial pathogenicity, bacteriocin production, and specific metabolic functions such as the breakdown of xenobiotic chemicals.

Fig. 5.1
A graphical representation of the processes of genome evolution. The components are point mutations, genome rearrangement, and horizontal gene transfer.

Graphical representation of the major processes facilitating the plasticity and evolution of the bacterial genome like point mutations, genome rearrangements, and horizontal gene transfer

Fig. 5.2
An illustration depicts the horizontal gene transfer in a bacterial cell. It depicts a bacteriophage on the cell membrane and the process is labeled transduction. Another bacterial cell has a genome, and the processes, transformation, and conjugation are labeled in the bacterial cells.

The processes involved in bringing about horizontal gene transfer in bacteria include transformation (cell-free, one-way DNA uptake), transduction (bacteriophage-mediated one-way genetic material transfer from one bacterium to the other), and conjugation (one-way genetic material transfer via physical association between two living bacterial cells)

Advances in next-generation shotgun sequencing have expedited the production of large amounts of genome sequence data, giving insight into genome evolution and speciation. Analyses of thousands of prokaryotic genome sequences have led to the understanding that the bacterial genome is primarily divided into the core genome, comprising the core genes, and the flexible genome, made up of accessory genes. Together these form the “pan-genome,” a term coined by Tettelin and collaborators (Tettelin et al. 2005). Generally, core genes encode essential metabolic activities, while accessory genes encode traits that improve bacterial fitness under specific growth or environmental conditions. The magnitude of the flexible prokaryotic genome is astonishing: only around 50% of the genome of any given strain is core and, with hundreds of novel flexible genes contributed by each strain, the compounded flexible pool is enormous (Rodriguez-Valera et al. 2016). Acquisition of these accessory genes may be mediated by entities known as GIs, which facilitate HGT. A GI is a cluster of horizontally acquired genes in a bacterial genome that differs from the neighboring genes in dinucleotide frequency, GC content, codon usage pattern, etc. (Busby et al. 2013). While the core genome reflects evolutionarily conserved character even under severe selection pressure, microbial genome dynamicity is achieved by recurrent gene acquisition and loss. In a microbial species, the plasticity of the genome results from HGT, which facilitates the acquisition of GIs and accelerates the rate of evolution. GIs make the genome flexible enough to adopt novel functions over a short time span.
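The division of a pan-genome into core and accessory genes can be illustrated with a minimal Python sketch. It assumes that gene families have already been clustered and that each strain is represented simply as a set of gene-family names; the function name `partition_pan_genome` and the toy gene names are hypothetical and are not taken from any cited study. A gene family is treated as core only if it occurs in every strain, which is a strict definition that real pan-genome tools often relax.

```python
from collections import Counter

def partition_pan_genome(strain_genes):
    """Split gene families into core (present in every strain) and accessory."""
    if not strain_genes:
        return set(), set()
    n_strains = len(strain_genes)
    # Count in how many strains each gene family occurs (once per strain).
    counts = Counter(g for genes in strain_genes.values() for g in set(genes))
    core = {g for g, c in counts.items() if c == n_strains}
    accessory = set(counts) - core
    return core, accessory

# Hypothetical toy data: the names are placeholders, not real annotations.
strains = {
    "strain_A": {"dnaA", "gyrB", "rpoB", "blaX"},
    "strain_B": {"dnaA", "gyrB", "rpoB", "hlyY"},
    "strain_C": {"dnaA", "gyrB", "rpoB", "blaX", "nifZ"},
}
core, accessory = partition_pan_genome(strains)
print(sorted(core))       # ['dnaA', 'gyrB', 'rpoB']
print(sorted(accessory))  # ['blaX', 'hlyY', 'nifZ']
```

In this toy example the accessory pool grows with every strain added, which is the behavior the text describes for the flexible genome.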

Genome plasticity, or genome flexibility, is the phenomenon in which the genome carries large tracts of strain-specific polymorphic genes, known as regions of genomic plasticity (RGPs), at different locations. It is critical for microbial survival in novel ecological niches, pathogenicity, symbiosis, and genome evolution (Mathee et al. 2008; Ogier et al. 2010). The key to highly flexible bacterial genomes is niche dynamicity. The greater the availability and proximity of exogenous genetic information, the more likely the bacterial genome is to acquire novel information. The niche exerts selection pressure that retains only helpful information and optimizes genomes according to the costs and benefits to different strains. RGPs are broadly categorized into hypervariable regions, which may arise from deletions of specific DNA segments in different bacterial strains, and mobile genetic elements (Ogier et al. 2010). Mobile genetic elements may move from one place to another within a genome, or they may be transferred between cells by the various HGT mechanisms. Transposases and site-specific recombinases are responsible for the mobility of mobile genetic elements and are encoded by genes located either on the commonly shared core genome or on the mobile genetic elements themselves.

5.2 GIs: Features, Types, Significance, and Plasticity

A GI is a fragment of foreign DNA that has been inserted into the bacterial genome and has clearly defined boundaries (Novick and Ram 2016). These transferable DNA regions typically range from 10 to 500 kb in size (Osborn and Böltner 2002). Hacker et al. (1997) initially described GIs as gene clusters inside a bacterial genome possessing a distinctive dinucleotide frequency and GC content. GIs contribute significantly to genome flexibility, evolution, and environmental adaptation (Li and Wang 2021) by equipping bacteria with genes that provide antibiotic resistance and virulence traits, and sometimes with genes encoding enzymes that establish de novo metabolic pathway(s) (da Silva Filho et al. 2018).
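Because GIs tend to differ from the rest of the chromosome in GC content, a simple sliding-window scan can flag candidate regions. The sketch below is a minimal illustration only: the function names, the window size, the step, and the deviation threshold are all assumptions chosen for the toy example, and practical GI predictors combine several signals (dinucleotide bias, codon usage, mobility genes) rather than GC content alone.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def atypical_gc_windows(genome, window=1000, step=500, threshold=0.10):
    """Return (start, end, gc) for windows whose GC fraction differs from
    the genome-wide mean by more than `threshold` (an assumed cut-off)."""
    mean_gc = gc_fraction(genome)
    hits = []
    for start in range(0, max(len(genome) - window, 0) + 1, step):
        chunk = genome[start:start + window]
        gc = gc_fraction(chunk)
        if abs(gc - mean_gc) > threshold:
            hits.append((start, start + len(chunk), gc))
    return hits

# Toy genome: a GC-rich backbone with a 1 kb AT-rich insert at position 3000.
toy = "GC" * 1500 + "AT" * 500 + "GC" * 1500
hits = atypical_gc_windows(toy, window=1000, step=500, threshold=0.2)
print([start for start, _, _ in hits])  # [2500, 3000, 3500]
```

The flagged windows overlap the AT-rich insert, while windows lying entirely in the backbone stay within the threshold.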

5.2.1 Features Attributed to a GI

GIs encode a diverse range of accessory genes for improved fitness, secretion, pathogenicity, resistance potential, metabolic flexibility, ecological adaptability, symbiosis, etc., in the harboring bacteria (Darmon and Leach 2014). GIs have several other features that distinguish them from the rest of the genome. These include flexible sequences that differ from the core genome, genes for self-mobilization (viz. insertion sequences or ISs, integrases, and transposases), flanking direct repeats (DRs), and particular integration sites (Juhas et al. 2009; Schmidt and Hensel 2004). The DRs may be generated by the recombination or transposition events that integrate GIs into the chromosome. A generalized graphical representation of a GI is given in Fig. 5.3. GIs evolve through genome exchange, gene reduction, or further acquisition of transposable elements (i.e., mobile genetic elements). Within each subset of islands, gene content may vary considerably between related strains of a species. The G + C content of GIs (25–75%) is characteristic and differs from that of the core chromosomal regions of the host bacterium (Schmidt and Hensel 2004). GIs are mostly inserted at the 3′ end of tRNA genes, a region often regarded as a hotspot for GI insertion (Lu and Leong 2016; Williams 2002). Structurally, however, almost every GI has the same organization: a recombination module (integrase/excisionase module), two attachment sites (attR and attL, at the right and left end, respectively), and sometimes a recombination directionality factor (RDF). Recombinases of the tyrosine and serine families carry out the strand exchange during GI integration (Boyd et al. 2009; Desvaux et al. 2020).
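The flanking direct repeats mentioned above can be searched for computationally once approximate island boundaries are known. The following sketch is a simplified illustration under stated assumptions: it looks only for exact repeats on the same strand, within a fixed flank on either side of the island, and the function name, flank size, and minimum repeat length are hypothetical choices rather than values from the cited literature.

```python
def longest_flanking_direct_repeat(seq, start, end, flank=50, min_len=8):
    """Longest exact repeat present both just upstream of `start` and just
    downstream of `end`, or None if no repeat of at least `min_len` exists."""
    left = seq[max(0, start - flank):start]
    right = seq[end:end + flank]
    # Try the longest possible repeat first and shrink until a match is found.
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        right_kmers = {right[i:i + k] for i in range(len(right) - k + 1)}
        for i in range(len(left) - k + 1):
            if left[i:i + k] in right_kmers:
                return left[i:i + k]
    return None

# Toy chromosome: a 30 bp "island" (all C) flanked by the repeat TTGACCGTAG.
toy = "A" * 10 + "TTGACCGTAG" + "C" * 30 + "TTGACCGTAG" + "G" * 10
print(longest_flanking_direct_repeat(toy, 20, 50))  # TTGACCGTAG
```

Real attachment sites may be imperfect repeats, so a tolerant (mismatch-aware) comparison would be needed for anything beyond a toy sequence.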

Fig. 5.3
A graphical illustration of the genomic island. It has core genomes at each end and a genomic island in between, which includes D R, I S, int, functional genes, I S, and D R.

A graphical representation of the generalized organization of a genomic island (GI) integrated within the bacterial chromosome. DR = direct repeats, IS = insertion sequence, and int = gene coding for integrase. The functional gene content of GIs is used to classify them into subtypes, such as (i) pathogenicity islands (PAIs), which carry genes essential for bacterial pathogenicity/virulence, (ii) resistance islands (RIs), which carry antimicrobial resistance genes, (iii) symbiosis islands (SIs), and (iv) metabolic islands (MIs), which encode adaptive metabolic abilities

5.2.2 Types of GIs

GIs are classified and named according to the functions their flexible gene content encodes (Rainey and Oren 2011). These are (i) pathogenicity islands (PAIs), which encode genes relating to bacterial pathogenicity or virulence, (ii) symbiosis islands, which have structural attributes similar to those of pathogenicity islands but encode proteins facilitating the mutualistic relationship between bacteria and multicellular organisms (Rainey and Oren 2011), (iii) antibiotic resistance islands, which encode antimicrobial resistance genes, and (iv) metabolic/catabolic genomic islands, which confer adaptive metabolic abilities such as the degradation of xenobiotic chemicals (Bertelli et al. 2019). Several kinds of mobile elements, including integrons, integrative and conjugative elements (ICEs), prophages, etc., fall within this broad definition of GI. GIs are further distinguished by their acquisition mechanism (i.e., transformation, conjugation, or transduction) and by the accompanying mobility elements (such as transposases, ISs, and integrases) that facilitate GI mobilization and transmission (Bertelli et al. 2019; Juhas et al. 2009; Langille et al. 2010; Soucy et al. 2015). According to Boyd et al. (2009), GIs are a discrete class of evolutionarily ancient integrative elements that are not “degenerate relics of prophages, episomes, integrons or ICEs.”

The insertion of a new GI is followed by changes in the cellular or colonial morphology, function, or even lifestyle of the recipient organism. Examples of GIs include the PAIs of Salmonella (SPI1) and Listeria monocytogenes; the symbiosis islands (SIs) of Bradyrhizobium and Mesorhizobium loti strain R7A; the acid fitness island (AFI) of Escherichia coli; the defense islands (DIs) of Microcystis aeruginosa and Shewanella sp. strain ANA-3; the resistance islands (RIs) of Haemophilus influenzae and Acinetobacter baumannii; the saprophytic islands of some Escherichia coli strains; a phenol-degrading ecological island of Pseudomonas putida; and the xanthan gum production island (a metabolic island) of Xanthomonas (Arashida et al. 2022; Chan et al. 2015; Hadjilouka et al. 2018; Lerminiaux et al. 2020; Lima et al. 2008; Makarova et al. 2011; Mates et al. 2007; Van Elsas et al. 2011; Xiang et al. 2021). Different ecological niches exert different selection pressures, under which similar GIs may serve different functions in different bacteria. A single bacterium may carry a variety of GIs in its genome to perform various functions.

5.2.3 Significance of GIs

The evolutionary benefit of a GI is that many genes (for example, a whole operon conferring a novel trait) can be transferred horizontally to the recipient’s genome in one step, bringing about significant changes in the recipient’s characteristics. According to the “selfish operon” theory, genes concerned with a particular function are clustered in a way that promotes their HGT (Lawrence and Roth 1996). A GI can improve adaptability and competitiveness within the niche, thus providing selective benefits under certain growth conditions. Its most significant evolutionary benefit is the capacity to transmit multiple genes together, which enables more effective adaptation and enhances fitness in particular ecological niches (Dobrindt et al. 2004).

5.2.4 Plasticity in GIs

Bacteria harbor several kinds of mobile genetic elements (MGEs): IS elements, plasmids, prophages, transposons, GIs, and ICEs. ISs are the simplest MGEs (< 2.5 kb in size) and carry no genes except those encoding the machinery essential for their insertion at various DNA sites (Siguier et al. 2014). Plasmids replicate autonomously, and their intercellular transfer in prokaryotes involves conjugation (Smillie et al. 2010). Prophages integrate into bacterial chromosomes, and their mobility depends on transduction (Brüssow et al. 2004). Transposons move from one site to another within a genome and do not by themselves undergo HGT. GIs are larger MGEs (10–500 kb) that carry genes for self-mobility together with other genes for strain-specific functions. The staphylococcal pathogenicity islands (SaPIs), gene transfer agents (GTAs), and ICEs are the three most common types of GIs. The first two are presumably descended from prophage ancestors and have retained crucial prophage architectural traits. The third group likely originated from conjugative plasmids, which acquired additional characteristics and transformed into mosaics. While GTAs and ICEs mediate HGT independently, the SaPIs depend on certain bacteriophages. The ICEs principally transmit their own DNA and the GTAs solely convey unlinked host DNA, whereas the SaPIs combine both behaviors. It is assumed that immobile GIs are variants of mobile ones (Novick and Ram 2016).

Not all GIs have genes for autonomous transfer. For example, SaPIs need helper phages for mobility (Lindsay et al. 1998). GIs share discrete segments of DNA between similar strains. However, their formation and ability to acquire accessory genes in the syntenic block lead to bacterial adaptation, genome diversification, and evolution. Different molecular events like recombination, deletion, duplication, inversion, etc., make GIs flexible.

The SaPIs of Gram-positive bacteria carry homologs of phage integrase and excisionase. Their functional relatedness to phages makes SaPIs highly mobile and gives them a unique lifestyle. The helper phage-mediated transfer of SaPIbov5, for example, involves a prophage cos site carried by the island. Owing to phage interference, SaPIs increase the transfer of chromosomal adaptive genes for host virulence. Genomic data from Staphylococcus imply that SaPIs have a typical phage-related genome organization but differ sharply from their progenitor prophages. Phage-related elements of streptococci and lactococci show orthology patterns similar to those of the SaPIs. Genome-based analysis also suggests that SaPI-like elements are widespread and diversified across bacterial genomes, which points to a successful evolutionary strategy (Chen et al. 2015; Dokland 2019; Novick et al. 2010; Novick and Ram 2016).

ICEs catalyze their own excision through site-specific recombination, independently of conjugation and integration. ICEs have diverse modular structures, i.e., gene clusters with different functions such as conjugation, integration or excision, and adaptation. Conjugation modules are of two types. The first is a single-stranded DNA-based MOB/MPF module, in which MOB is the relaxase protein family and MPF is the mating pair formation protein family. The second is a double-stranded DNA-based module that encodes a DNA translocator of the SpoIIIE/FtsK protein family (Tra proteins) (Álvarez-Rodríguez et al. 2020; Besprozvannaya et al. 2013; Guglielmini et al. 2011). ICEs carrying a double-stranded conjugation module are found in actinomycetes and are termed actinomycete ICEs (AICEs) (Johnson and Grossman 2015; Te Poele et al. 2008). The MOB relaxase acts on the 5′ end of ICEs and initiates DNA transfer through the MPF into the recipient bacterial cell. Tra proteins, on the other hand, form channels and act on a cis-acting locus of the circular AICE to transfer it. The integration/excision modules encode the recombination enzymes, generally a tyrosine or serine recombinase (or a DDE transposase), which recognize the repetitive flanking sequences of ICEs. ICEs containing such integration/excision modules are usually found at the 3′ end of tRNA genes; when they integrate into housekeeping genes, they can be found at either the 3′ or the 5′ end. ICEs Ecoc54N from Escherichia coli, Tn5397 from Clostridium difficile, Tn1806 from Streptococcus pneumoniae, Tn6012 and ICE6013 from Staphylococcus aureus, and TnGBS elements from Streptococcus agalactiae have integration/excision modules with either a tyrosine/serine recombinase or a DDE transposase (Antonenka et al. 2006; Bellanger et al. 2014; Camilli et al. 2011; Guérillot et al. 2013; Mingoia et al. 2016; Sansevere et al. 2017; Ternan et al. 2012; Wang et al. 2006).

The fundamental mechanisms behind GI plasticity and evolution are the acquisition, exchange, and deletion of modules. Sequence comparisons between ICEs reveal extensive exchanges of modules. It has been suggested that the host specificity of ICEs evolved through site-specific recombination between the conjugation and integration modules of various ICEs (Burrus et al. 2002). Module exchanges also occur between other GIs. Sequence comparisons further reveal that many GIs arose through deletions in mobility modules. In some GIs, for example ICE_2603_tRNALys from S. agalactiae, genes associated with intercellular or intracellular mobility (such as orfD) have been deleted through the incorporation of a copy of an insertion element (IS1193). The integration of one GI within another, followed by restructuring, results in the acquisition of novel modules. The insertion of non-mobilizable ISs, besides inactivating genes, induces deletion or inversion of neighboring sequences. In the high-pathogenicity island (HPI) of Y. pestis KIM, ISs are associated with deletion of the conjugation module (Bellanger et al. 2014; Chen et al. 2010; Puymège et al. 2013).

5.2.4.1 Driving Forces for GIs Plasticity

GIs vary between individual strains and help bacteria adapt to new ecological niches, host-cell interactions, and virulence. This strain-specific propensity for high island variability contributes to genome plasticity and subsequent genome evolution, which may lead to niche specialization of specific bacterial strains. The variable strain-specific genomic DNA segments may evolve through rearrangement events such as recombination, deletion, insertion, duplication, amplification, inversion, or tandem accretion.

5.2.4.1.1 Recombination

Recombination is a crucial mechanism for maintaining genome flexibility and fitness. Homologous and non-homologous recombination exchange genetic information between DNA sequences with high or low identity, respectively (Didelot et al. 2012). In homologous recombination, the recombination rate falls as sequence identity decreases. Non-homologous recombination occurs during either DNA synthesis or strand breakage and adds new genetic material through HGT and site-specific recombination. The insertion of new DNA segments often leads to deletions and subsequent hairpin formation due to strand slippage. The new DNA segments introduce genome diversity and loss of synteny in bacteria. In homologous recombination, the Rec enzymes repair single-strand or double-strand DNA gaps and thereby increase the recombination rate. The Rec enzymes, on the other hand, negatively regulate non-homologous recombination, lowering its rate. Site-specific recombination uses a tyrosine or serine recombinase or a DDE transposase to recognize the flanking repeats, causing the exchange of DNA segments between integration/excision modules of different GIs. Both types of recombination play indispensable roles in bacterial genome evolution. The plasticity of GIs may arise through recombination between two separate GIs, as shown in Fig. 5.4. Recombination mediated by the transposon Tn6022 has been proposed as the source of GI formation in many Acinetobacter baumannii strains (Patel 2016; Peters et al. 2014). From an ecological perspective, strains occupying overlapping niches have more opportunities to exchange genetic information between GI modules than lineages living in distant niches.

Fig. 5.4
An illustration depicts G I plasticity through recombination. The vertical oval shape on the left illustrates a plasmid and bacterial cell genome. A cross mark is depicted to represent the recombination between integration or excision modules. The vertical oval shape on the right depicts the genome with the new G I.

GI plasticity via recombination. The plasticity of GIs may come through the recombination process between two separate GIs. Here, the red cross mark in the left diagram depicts the recombination between the integration/excision modules of GI1 and GI2. This event leads to a new GI formation, as shown on the right side of the figure
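The exchange depicted in Fig. 5.4 can be mimicked with a toy model in which each GI is an ordered list of module names and a single crossover takes place at a module shared by both islands. This is only a schematic sketch: the module names and the function `recombine_at_shared_module` are hypothetical, and the model ignores sequence-level detail such as the exact crossover point within the shared module.

```python
def recombine_at_shared_module(gi1, gi2, shared):
    """Single crossover at a module present in both islands.

    Returns the two reciprocal products. Raises ValueError (from list.index)
    if `shared` is missing from either island."""
    i, j = gi1.index(shared), gi2.index(shared)
    product_a = gi1[:i + 1] + gi2[j + 1:]
    product_b = gi2[:j + 1] + gi1[i + 1:]
    return product_a, product_b

# Hypothetical islands sharing an integration/excision module ("int_xis").
gi1 = ["attL", "int_xis", "conjugation_1", "cargo_1", "attR"]
gi2 = ["attL", "int_xis", "conjugation_2", "cargo_2", "attR"]
new_gi, reciprocal = recombine_at_shared_module(gi1, gi2, "int_xis")
print(new_gi)  # ['attL', 'int_xis', 'conjugation_2', 'cargo_2', 'attR']
```

In this representation the new island keeps the left boundary and recombination module of GI1 but carries the conjugation and cargo modules of GI2.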

5.2.4.1.2 Deletion

Inactivation or deletion of part of the genome facilitates bacterial genome evolution. While the acquisition of novel genes enhances bacterial colonization potential, gene loss enables niche specialization. The “Streamlining” and “Black Queen” hypotheses both assume that the loss of superfluous genes improves fitness by lowering the metabolic burden (Giovannoni et al. 2005; Morris et al. 2012). The “Streamlining” theory holds that challenging environmental circumstances impose strong selection to minimize cell complexity. The “Black Queen” hypothesis, on the other hand, states that an organism can stop performing a costly function under certain conditions, namely when other members of the community supply it. Both hypotheses support reductive genomic evolution. These trends are seen when host-specialized, highly virulent bacterial pathogens evolve from ancestors with a broad host range through massive gene loss. In endosymbionts, genome shrinkage is essential and possibly occurs by deletion. Many new flagellar genes have been identified in the flagellar operon (fli operon) of Salmonella enterica serovar Typhimurium LT2. Deletions in the 2.07 Mbp region in the fli operon resulted in increased fitness (Frye et al. 2006; Koskiniemi et al. 2012). The plasticity and evolution of GIs also depend on the deletion of different modules. GIs are often prone to IS insertion, which leads to the inactivation or loss of superfluous genes and thus lowers the energy/mass expenditure.

5.2.4.1.3 Insertion

Insertion is the acquisition of new genes, which induces adaptive alterations in genome architecture and confers on bacteria the ability to survive in new ecological niches. Their transposition ability allows ISs to integrate almost anywhere in the genome. Where they land is largely a matter of chance, and several mechanisms control their disruptive activity, which results in genome innovation. The introduction of ISs into the genome creates an opportunity to evolve a better adaptive response to stressors. In the genomes of nosocomial isolates of Staphylococcus epidermidis, the insertion element IS256 facilitates tolerance to drugs, making the isolates multi-drug resistant. Insertions also occur within GIs, which thereby acquire novel gene clusters (Dengler Haunreiter et al. 2019; Espadinha et al. 2019; Otto 2009). Some GIs may integrate within other GIs, followed by subsequent reorganization. Several mechanisms are responsible for the evolution of such mobilizable GIs. Non-mobilizable GIs, which do not encode self-conjugative machinery, can insert within ICEs. In the genome of Mesorhizobium loti USDA110, about 64 ISs are found within the 680 kb ICEMlSymR7A (Kaneko et al. 2002; Sullivan et al. 2013). A variety of transposable elements have also been recognized in many ICEs or related GIs. Some ICEs also bear prophages; for example, Tn6164 from C. difficile bears a complete prophage (Hargreaves et al. 2016). A graphical representation of deletion and insertion in GIs is shown in Fig. 5.5.

Fig. 5.5
An illustration of two bacterial genomes with their insertion sequences. A cross is marked in the first bacteria between the genome and insertion sequence. Deletion and insertion are illustrated along with a legend.

Schematic presentation of deletion and insertion in GIs. Insertion of region 2 from GIA into GIB and of region A from GIB into GIA leads to the loss and gain of regions within the islands (Melnyk et al. 2019)

5.2.4.1.4 Duplication and Inversion

Duplications are dynamic forces that aid survival in hostile environments. Repeated amplification of a genomic region results in tandem repetitive sequences. Inversion is another process, in which a DNA fragment is excised and rejoined in the opposite orientation. In general, inverted sequences are bordered by inverted repeats and are occasionally found inside a coding region. Such a reversal modifies the gene expression profile and alters the phenotypic characteristics of the bacterium. Coupled gene duplication-amplification in response to drugs confers adaptability on distinct bacterial lineages. It has been found that multistep adaptive development is preceded by gene amplification: mutations can then accumulate in the additional copies while the other copies of essential genes remain intact, enhancing fitness. Both duplication and inversion frequently target GIs, leading to plasticity. A schematic presentation of duplication and inversion in a GI is depicted in Fig. 5.6. E. coli ST58 contains two GIs, PAI-1 and PAI-2, which share a great deal of genetic information. A duplication event at two distinct tRNA–Phe–GAA sites led to the development of these two islands from a single progenitor island. This duplication was followed by inversion, and PAI-1 and PAI-2 acquired separate sets of genes over time (Wyrsch et al. 2020). IS-mediated duplication has been found in the symbiosis island of the WN105 mutant of Bradyrhizobium diazoefficiens USDA110 (Arashida et al. 2022).
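At the sequence level, the two rearrangements can be expressed as simple string operations. The sketch below is illustrative only and rests on two simplifying assumptions: inversion is modelled as replacing a segment with its reverse complement at the same position, and duplication is modelled as a perfect tandem repeat. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]

def invert_region(seq, start, end):
    """Inversion: seq[start:end] is replaced by its reverse complement."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]

def duplicate_region(seq, start, end, copies=2):
    """Tandem duplication: seq[start:end] occurs `copies` times in a row."""
    return seq[:start] + seq[start:end] * copies + seq[end:]

island = "AAACCGTTT"
print(invert_region(island, 3, 6))     # AAACGGTTT
print(duplicate_region(island, 3, 6))  # AAACCGCCGTTT
```

Applying `invert_region` twice to the same coordinates restores the original sequence, which reflects the reversible nature of inversions between fixed repeats.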

Fig. 5.6
An illustration of two bacterial genomes depicting duplication and inversion. A duplicated and inverted region 1 is included in the series, within the genome on the right. A legend is depicted on the right.

Duplication and inversion in GI. Left diagram shows normal GI, whereas right shows GI with duplicated and inverted region 1 (Wyrsch et al. 2020)

5.2.4.1.5 Tandem Accretion

GIs also originate from site-specific recombination and subsequent tandem accretion-deletion of CIMEs and ICEs (Fig. 5.7). Site-specific accretion, with its accompanying gene gain and loss, is a key mechanism of GI flexibility and evolution. The composite structure of ICESt1 and related GIs from Streptococcus thermophilus shows that these elements developed via site-specific recombinations and deletions. At the 3′ end of the fda locus in seven distinct strains of S. thermophilus, four forms of ICESt1-related ICEs with comparable conjugation and recombination modules were identified. These elements are flanked by different site-specific attachment sites (att) that are closely related to the attachment sites of two distinct cis-mobilizable elements (CIMEs), CIME19258 and CIME302. This relatedness permits site-specific recombination, which leads to the excision of ICEs and the subsequent incorporation of CIMEs at the 3′ end of fda. Moreover, genome analysis showed identically truncated sequences at the att regions of these ICEs and CIMEs (Bellanger et al. 2014; Pavlovic et al. 2004).

Fig. 5.7
An illustration of two bacterial genomes depicting tandem accretion of G Is along with a list of legends to the right. An I C E with a t t m and a t t n strung on it is separate from the chromosome in the genome on the left. The same is integrated into the chromosome in the genome on the right.

Tandem accretion of GIs. The diagram shows the integration of ICE adjacent to the att1 of the resident CIME by site-specific recombination, causing composite GI formation (Bellanger et al. 2014)

5.3 GIs and Bacterial Evolution

The concept of a prokaryotic species is complicated, and it is commonly assumed that such species form through continuous processes combining gene loss and gain facilitated by HGT (Lawrence 2001). The acquisition and loss of auxiliary genes within a GI, and the possible transmission of chromosomal DNA from the host, may be a precondition for bacterial species evolution. Since GIs can move by excision, conjugation-mediated self-transfer into a new host, and reintegration into the host chromosome, they can transmit a piece of the host genome into the recipient bacterium. Receiving donor (foreign) DNA may also open the way for bacterial evolution when the donor DNA is integrated into the recipient genome via transformation. The unique type IV secretion system (T4SS) of Neisseria gonorrhoeae, encoded by a large horizontally acquired gonococcal genetic island (GGI), enables the secretion and spread of host chromosomal DNA. When the secreted chromosomal DNA is later taken up by transformation, it can recombine with the recipient’s chromosome, adding to antigenic variation and drug resistance. Thus, GIs appear to influence the evolution of the host bacteria. This process occurs in several Gram-negative and Gram-positive, environmental and pathogenic bacteria, such as N. gonorrhoeae, Acinetobacter sp. ADP1, Haemophilus influenzae, Pseudomonas stutzeri, Bacillus subtilis, Streptococcus pneumoniae, and Ralstonia solanacearum. In addition to transformation, the transmission of GIs across bacterial species can also be mediated by conjugation and bacteriophages. The SaPIs are the best-known example of this phenomenon. There have been several other reports of bacteriophage-mediated transfer of GIs, including the HPI of Yersinia pseudotuberculosis and the GIs of Prochlorococcus sp. (marine cyanobacteria) (Juhas et al. 2009).

5.3.1 GIs in Bacterial Genome Evolution and Shapiro’s Geographical Metaphors

In his article “How clonal are bacteria over time?,” Shapiro (2016) convincingly employs geographical metaphors to explain how GIs are associated with genome evolution. According to Shapiro, horizontal transfer (recombination) rates vary so much across the genome that an entire population can be clonal except at a few loci. These loci are referred to as GIs. The term “peninsula” might better depict the connection of the islands to the microbial genome. An island remains evolutionarily independent of the mainland genome, but their fates may become linked. For instance, a bacterium may obtain a gene from the enormous microbial gene pool. The acquired gene enables the bacterium to colonize a new ecological niche, initiating a clonal expansion in which the fates of the acquired gene and its new host genome are inextricably intertwined, at least for the duration of the expansion. Certain microbial genomes may include so many islands that there is no mainland, only a large archipelago. Archipelagos are not always stable over time, and they can occasionally merge into continents. When the ecological conditions are favorable, any genome from the panmictic gene pool can break free from the “gravitational pull” of recombination and embark on clonal growth (Shapiro 2016).

With the help of bioinformatics, it became apparent that novel genes with unknown functions are present within GIs. These novel genes lack orthologs in other bacterial species and might furnish the host bacterium with various adaptive functions. Different accessory functions like additional metabolic pathways, resistance toward antibiotics and harmful drugs, pathogenesis, symbiosis, or traits involved in increasing microbial fitness, are encoded by GIs.

Plasticity within GIs alters bacterial lifestyle or behavior (Dobrindt et al. 2004), and it is well recognized that evolutionary “quantum leaps” in bacteria are mediated by GIs (Juhas et al. 2009), which have the intrinsic ability to transfer a large set of genes and integrate as a whole into the recipient genome, thereby promoting bacterial diversity and speciation (Guo et al. 2012). Some of the GI-mediated features that can offer a significant selective benefit to the host bacteria, and thus a step toward evolution and speciation, are discussed briefly below.

5.3.2 GI and Heavy Metal Tolerance

GIs may provide an advantage for living in harsh toxic environments leading to the evolution of resistant strains. One of the most critical toxic environments includes acid mine drainage (AMD) which contains high concentrations of toxic heavy metals (including arsenic). A comparative genomic study on different isolates of Thiomonas bacteria obtained from AMD of Carnoulès, France, revealed the presence of more than 20 GIs. The findings also indicated that arsenic-associated GIs had evolved differently in two closely related Thiomonas strains, resulting in varied survival capabilities in As-rich environments (Freel et al. 2015).

GIs contributing to heavy metal tolerance in Mucilaginibacter spp. have been observed in isolates obtained from gold/copper mines. Clusters of genes that may be connected with mobile genetic elements were discovered by analyzing the location of heavy metal resistance determinants. These loci contained genes for tyrosine recombinases (integrases) and subunits of T4SS, enabling integration/excision and conjugative transfer of many GIs, respectively. The putative presence of many CTnDOT-related GIs in the genomes of Mucilaginibacter may play a crucial role in genome evolution and subsequent adaptation (Vásquez-Ponce et al. 2018).

In Cupriavidus metallidurans, GI variation has contributed to the variable plasticity of the genome. C. metallidurans is a versatile multi-metal-resistant bacterium. Comparative genomic hybridization of sixteen C. metallidurans strains revealed that the broad arsenal of heavy metal resistance factors was well preserved across all strains. In contrast, the transposable elements found in strain CH34 were not uniformly present in the other strains; their distribution displayed a pattern indirectly associated with a specific geographical location or biotope. One set of strains had nearly all transposable components, whereas the second group had a substantially lower proportion. This difference was also reflected in the strains’ capacity to break down toluene and to grow autotrophically on carbon dioxide and hydrogen gas, traits that are both linked to distinct GIs of the Tn4371 family (Van Houdt et al. 2012).

5.3.3 GIs in Secondary Metabolism, Pathway Evolution and Xenobiotic Degradation

GIs have been found in Salinispora, a marine bacterial genus belonging to the phylum Actinobacteria. Genomic comparison of S. tropica with S. arenicola showed that three-quarters of the species-specific genes are distributed within 21 GIs associated with the production of secondary metabolites, establishing a connection between secondary metabolism and functional adaptation. All species-specific biosynthetic pathways are found in GIs, most of which are found in S. arenicola, contributing to its worldwide distribution in different habitats. Gene duplication and acquisition dominate genome evolution, which provides rapid chances for generating novel bioactive compounds in the case of secondary metabolism. The horizontal sharing of secondary metabolic pathways plays a major functional role in acquiring natural product biosynthetic gene clusters, which also serve as the driving force for maintaining bacterial diversity (Penn et al. 2009). GIs have been identified as hotspots for biosynthetic gene cluster acquisition in Salinispora (Letzel et al. 2017). In what appears to be a plug-and-play model of evolution, clusters of acquired biosynthetic genes are targeted to certain GIs and can replace each other (Letzel et al. 2017).

GIs seem to have a significant role in developing novel pathways through “patchwork assembly” (a novel combination of previously existing pathways) (Dobrindt et al. 2004; Guzman and Harris 2015; Mingoia et al. 2016). Studies reveal that plasmids, transposons, and GIs carry catabolic genes encoding functions that can degrade xenobiotic substances. For instance, Tn4371, a GI of Ralstonia oxalatica, can confer the ability to break down chlorobiphenyl. The clc element, first discovered in Pseudomonas knackmussii B13, enables its host to utilize chloroaromatic compounds as a carbon source. In the field of biodegradation, ICEclc is the best-known ICE. It bears the clc genes for the ortho-cleavage of chlorocatechols and the amn genes for aminophenol metabolism. This element enables the metabolism of 3-chlorocatechol, 3-chlorobenzoate, 4-chlorocatechol, as well as aminophenol. Due to its self-transfer capabilities, it can insert itself into the genomes of different Proteobacteria depending on environmental circumstances (Klockgether et al. 2006). Based on amino acid homology searches, similar clc-like elements have been identified in bacterial genomes from around the world, including Xylella fastidiosa 9a5c (a plant pathogen), Pseudomonas aeruginosa C (a clinical isolate), P. aeruginosa SG17M (an environmental isolate) and Xanthomonas campestris (Lacour et al. 2006). Pseudomonas-like clc elements have also been identified in Ralstonia sp. JS705, an isolate reported from contaminated groundwater. This clc element codes for enzymes that metabolize chlorobenzene to chlorocatechol and shows 85–100% nucleotide similarity in the conserved region (Klockgether et al. 2006). These discoveries demonstrate that the clc element can spread to new environments and obtain new functionalities within its current location (van der Meer and Sentchilo 2003).
GIs conferring the capacity to degrade xenobiotics and toxic compounds have been detected in many bacterial populations. Some examples of GIs with biodegradation functions are listed in Table 5.1.

Table 5.1 List of GIs with biodegradation potential for different xenobiotics

5.3.4 GIs and Siderophore Expressing Bacteria

Bacteria are well known for expressing siderophore-based iron uptake systems. Siderophores are low-molecular-weight secondary metabolites that are synthesized and released into the environment, where they chelate ferric iron to combat iron deficiency (Neilands 1995; Thode et al. 2018). This is an adaptation to survive in an iron-restricted environment and is associated with virulence. Genes encoding siderophores are distributed in several pathogenic and non-pathogenic bacterial species harboring GIs. Examples include HPI in Yersinia sp.; SHI-2, SRL, and SHI-3 in various species of the genus Shigella; and PPI-1 in S. pneumoniae (Dobrindt et al. 2004). Such GIs can serve as fitness islands in environmental bacteria or as PAIs in pathogenic bacteria.

5.3.5 GIs and Bacterial Secretion Systems

The evolution of pathogenicity in bacteria through the acquisition of GIs carrying virulence genes is a well-established phenomenon. During evolution, bacteria may have gained new genes by HGT, or their existing genes may have acquired mutations. One fine example of successful host-pathogen interaction is represented by several classes of protein secretion systems encoded by GIs (Martínez 2013). The type III secretion systems (T3SS), or “contact-dependent” secretion systems, are complex multiprotein machines (Scherer and Miller 2001). T3SS is generally expressed by pathogenic bacteria infecting plants and animals, including genera like Yersinia, Shigella, Salmonella, Pseudomonas, enteropathogenic E. coli (EPEC), Erwinia and Rhizobium. In some exceptional cases, two T3SSs are expressed within a single pathogen, each necessary at a different infection stage. In S. enterica, of the two T3SSs (harbored by SPI-1 and SPI-2), one is essential for the initial interaction with and penetration into the eukaryotic target cell (intestinal epithelial cells), whereas the other is essential for systemic infection (Juhas et al. 2009).

GIs of many bacterial pathogens encode T4SSs, translocating bacterial effector proteins through the bacterial membrane and plasma membrane into eukaryotic host cells. T4SSs, in turn, mediate HGT, which contributes to plasticity of the genome, development of infectious diseases, and the spread of drug resistance and other attributes related to virulence. The architecture of the genetic determinants of T4SS is diverse and comprises numerous genes grouped as a single functional unit (Juhas et al. 2007). The T4SS has been extensively studied in Agrobacterium tumefaciens. Unlike the T3SS system, this complex system is unique as it delivers nucleoprotein complexes and effector proteins into plant cells, contributing to pathogenicity directly (Dobrindt et al. 2004).

5.3.6 GIs and Antimicrobial Resistance

GIs are one of the most significant routes for acquiring drug resistance. The advent of methicillin-resistant Staphylococcus aureus (MRSA) has largely been attributed to the so-called staphylococcal cassette chromosome mec (SCCmec) islands present in the genome of S. aureus. This island can also integrate other MGEs and might confer resistance against additional antibiotics, thus representing a hotspot of variable size (20 kb to ≥60 kb) (Dobrindt et al. 2004). MRSA is resistant to various antibiotics like methicillin, penicillins, kanamycin, tobramycin, bleomycin, tetracycline, macrolide, lincosamide, streptogramin, vancomycin and also to heavy metals (Juhas et al. 2009). The origin of the SCCmec island in S. aureus is yet to be established; however, comparative bioinformatic studies have proposed that it could have originated via HGT from other staphylococcal species, such as S. sciuri, S. fleurettii, S. epidermidis, or S. haemolyticus. Reports have shown the existence of SCCmec in S. epidermidis well before its discovery in S. aureus. Thus, SCCmec in S. epidermidis might act as a pool of resistance genes contributing to the evolution of multi-drug-resistant S. aureus. In another case, mecA was found to be naturally present in the chromosome of S. fleurettii, which was therefore proposed to be the original source of the mecA in SCCmec. Many dynamic SCCmec islands have been discovered in the S. haemolyticus genome, indicating that S. haemolyticus is a potent carrier of methicillin resistance genes (Juhas 2019).

The genus Enterococcus has become a chief cause of nosocomial infections, and drug-resistant strains are the prime drivers of the swift expansion of such enterococcal infections. Besides genomic modification and HGT, GIs also play a crucial role in the acquisition of drug resistance. In a study where the whole genome sequences of some E. faecium and E. faecalis strains (carrying many resistance genes) were screened to analyze the correlation between antibiotic resistance genes (ARGs) and GI transmission, two observations became distinct: first, the prevalence of GIs in Enterococcus, and second, the significant contribution of antibiotic-resistant genomic islands (ARGIs) to the dissemination of some ARGs. The study identified 119 GIs in 37 strains, an average of 3.2 per strain, indicating the universal presence of GIs in Enterococcus. GIs in these strains were found to harbor various ARGs, including those conferring resistance to aminoglycosides, chloramphenicol, glycopeptides/peptides, lincosamides, and streptomycin, as well as multi-resistance efflux pumps. The ARGs identified in the enterococcal ARGIs are mdtG (encodes an efflux pump providing resistance against fosfomycin), tetM (tetracycline resistance), dfrG (diaminopyrimidine antibiotic resistance), lnuG (lincosamide resistance), and fexA (an efflux pump providing chloramphenicol resistance). Besides encoding drug resistance, some of the GIs carry mobility-related elements, like genes for conjugation, transposase or excisionase. Credible relationships among enterococcal strains and GIs were found, indicating frequent genetic exchanges within and between Enterococcus strains. Regular GI-mediated genetic exchanges among all the strains (comprising E. faecium and E. faecalis) were not unusual (Li and Wang 2021). The high plasticity of the Enterococcus genome has been attributed to the conjoint action of HGT and either gain or loss of genetic information.
According to Darwin, environmental selection pressure serves as the driving force for the evolution of an organism. This hypothesis fits well in the case of MDR Enterococcus, isolated from complex ecological niches like hospitals, medical clinics, farmlands, contaminated water, stools, humans, and pigs, where harsh environmental factors (such as antimicrobial compounds, organic and inorganic biocides, and heavy metals) always exist. GIs harboring numerous novel genes, whether self-mobile or not, might have integrated into the bacterial genome. Consequently, the recipient organism develops a new metabolic potential that enhances fitness or adaptability (Li and Wang 2021).

Many Salmonella enterica serovars responsible for gastrointestinal illness are resistant to antibiotics due to GIs bearing a class 1 integron that contains the resistance genes. Studies have suggested that Salmonella genomic island 1 (SGI1) carries a complex multi-drug resistance segment, imparting resistance against many antibiotics, including tetracycline, ampicillin, chloramphenicol/florfenicol, sulfamethoxazole and streptomycin/spectinomycin. The SGI1-associated MDR region comprises a complex integron harboring the aadA2, floR, blaPSE, tetR, and tetG genes (Vo et al. 2010). Many Salmonella serovars and Proteus mirabilis possess SGI1 or similar islands harboring diverse resistance gene sets. SGI1 is a mobilizable integrative element that can be transferred experimentally into E. coli (Hall 2010).

Cholera, caused by Vibrio cholerae, is a dreadful disease. Reports of multidrug-resistant V. cholerae strains have been frequent over the past few decades. The spread of resistance determinants is primarily due to mobile genetic elements such as the SXT/R391 integrative conjugative elements, IncC plasmids, and GIs. Transmission of the IncC plasmid is activated by the master activator AcaCD (Rivard et al. 2020). The regulatory network of AcaCD extends to chromosomally integrated GIs. A discrete and novel mobile genomic island (MGI), MGIVchHai6, integrated into the chromosome of the multidrug-resistant V. cholerae isolate HC-36A1 (Carraro et al. 2016), contains an In104-like multi-drug resistance integron and a mercury resistance transposon, analogous to SGI1. Acquisition of MGIVchHai6 plays a vital role in resistance against β-lactams, chloramphenicol, trimethoprim, tetracycline, sulfamethoxazole, and streptomycin/spectinomycin (Carraro et al. 2016).

Pseudomonas aeruginosa is a severe threat to burn patients and the immunocompromised. Different high-risk clonal strains, like ST111, ST175, and ST235, carry genes that confer resistance to β-lactam antibiotics (Roy Chowdhury et al. 2016). GIs play a determining role in the spread of resistance determinants, such as metallo-β-lactamases and extended-spectrum β-lactamases, that compromise a wide variety of effective antibiotics. P. aeruginosa ST235 strains carry Tn6162 and Tn6163 in GI1 and GI2, respectively. The class 1 integron coupled with Tn6163 in GI2 carries a blaGES-5–aacA4–gcuE15–aphA15 cassette array that confers resistance to aminoglycosides and carbapenems. Studies suggest that GI2 could have evolved from a novel ICE. GI2 is flanked by 12-nucleotide direct repeats and codes for integration, conjugative transfer and ICE-specific proteins (Roy Chowdhury et al. 2016).

Another Gram-negative opportunistic pathogen, Acinetobacter baumannii, is a nosocomial pathogen that causes serious health hazards to immunocompromised patients. A. baumannii has been garnering considerable attention due to its capacity to rapidly build up multi-drug resistance. The sequences of many A. baumannii genomes have revealed a vast collection of ARGs, several of which are connected with transposable elements and ISs and may be found in GIs known as AbaR resistance islands (RIs) (Leal et al. 2020; Liu et al. 2014). Different AbaR islands have been found that vary in size and are dynamically reshaped, primarily by recombinases, transposases, and integrases (Leal et al. 2020). A few resistance genes are present on plasmids, which can be exchanged within and between species, even by prophages (Leal et al. 2020). Studies have detected a novel GI (GIBJ4) in the drug-sensitive strain BJ4; it possesses metal resistance genes and is inserted at the position where AbaR-like RIs commonly reside in other strains of A. baumannii (Liu et al. 2014). In A. baumannii, several antibiotic resistance determinants are also present outside the RIs, such as integrons, chromosomal intrinsic antibiotic resistance genes, and the blaOXA-23-containing transposon Tn2009 (Liu et al. 2014).

5.3.7 In Planta GI Mediated Bacterial Evolution

Transfer of GIs between bacteria had long been demonstrated in vitro but not during the infection process in the host. Lovell et al. (2009) demonstrated that horizontal transmission of a GI (PPHGI-1) occurs in planta between strains of the plant pathogen Pseudomonas syringae pv. phaseolicola (Pph). This study reveals that the transfer of PPHGI-1 across Pph strains by transformation involves four unique steps: (i) excision of the GI from the bacterial chromosome, (ii) release of the circular episome from the bacterium, (iii) uptake by competent bacterial cells, and (iv) integration at a particular att site. Transformation, the simplest method of DNA exchange, may thus drive the evolution of bacterial pathogens via HGT (Lovell et al. 2009).

5.3.8 GIs in Evolution of Pathogenic Bacteria

Many bacteria incorporate PAIs, a subset of GIs, in their chromosomes. PAIs are specialized islands comprising arrays of genes whose expression leads to pathogenicity (virulence) and disease. These PAIs provide fitness to the PAI-positive bacteria directly or indirectly by increasing their chances of survival in vivo and/or transmission to new hosts, and they contribute to genome evolution. This PAI-mediated fitness can be well observed during the onset of clinical symptoms, which is directly linked to the pathogenicity or lesions triggered by the pathogenic or virulent bacteria (Hacker and Carniel 2001). PAIs are the best-known and most exhaustively studied GIs. The various determinants of pathogenic bacteria responsible for pathogenesis or disease are embedded within these PAIs and also on various mobile elements like extra-chromosomal plasmids, phages, insertion elements, and transposons (Schmidt and Hensel 2004). PAIs are ubiquitous in both Gram-positive and Gram-negative pathogens. Some examples of Gram-negative bacteria harboring PAIs include Salmonella spp., Neisseria spp., Shigella spp., Yersinia spp., Helicobacter pylori, E. coli, Pseudomonas spp., Vibrio cholerae, Porphyromonas gingivalis, and Francisella spp. Examples of Gram-positive pathogenic bacteria having PAIs include Staphylococcus aureus, Streptococcus spp., Listeria spp., Clostridium spp., and Enterococcus spp. The presence of a certain PAI is, on the whole, specific to a pathogenic bacterium or a specific strain of bacteria. A particular bacterium can also have more than one PAI in its genome (Gal-Mor and Finlay 2006). Bacterial species or strains equipped with PAIs have an inherent advantage over their non-PAI-bearing counterparts when it comes to pathogenicity. Most virulence determinants of the typical Salmonella enterica are present both in the chromosome and within PAIs, termed Salmonella pathogenicity islands (SPIs).
Both SPI-1 and SPI-2 are essential in determining virulence. SPI-1-encoded T3SS proteins (Shea et al. 1996) build up complex machinery for the translocation of various effector proteins from the extracellular S. enterica into host (eukaryotic) cells (Schmidt and Hensel 2004). SPI-1 also codes for various regulators, some acting as transcriptional activators and others as inhibitors or repressors of SPI-1 genes. The most crucial are HilA, HilE, and LeuO; these gene products and many more work in a complex way to tightly regulate SPI-1 genes (Lou et al. 2019). HPI, a discrete PAI, is naturally found in all the virulent serotypes of Yersinia sp., namely Y. pestis, Y. enterocolitica and Y. pseudotuberculosis, but is completely lacking in serotypes with low virulence. Due to the high instability of HPI, it has also been found in various other members of the enterobacteria (Carniel et al. 1996). Clostridium difficile, an anaerobic bacterium, produces various toxins leading to diarrhea and pseudomembranous colitis. However, toxin production is restricted to the virulent (toxigenic) variant and is lacking in the non-virulent type. Comparing both variants revealed that the genes coding for the toxins are integrated into a PAI termed PaLoc (pathogenicity locus) (Braun et al. 1996). The first case of vancomycin resistance in a bacterial pathogen was described from a clinical isolate of E. faecalis. Genes related to virulence determinants are incorporated in a 154 kb PAI. It has been found that commensal E. faecalis strains were converted into virulent ones by acquiring GI-encoded virulence factors from virulent E. faecalis (Juhas et al. 2009; Shankar et al. 2002). Members of the genus Pseudomonas have been discovered from multiple habitats, and some species are even considered opportunistic pathogens. Genomic analysis has shown the presence of different types of PAI in strains of P. aeruginosa, termed PAGI (viz., PAGI-1, 2, 3), which confer various adaptive traits (Battle et al. 2009). PAGI-1 (from strain PAO1 and patients with urinary infection) was found to contain genes coding for numerous dehydrogenases (with unknown potentials) and proteins able to sense redox-cycling agents, thus indicating a protective role of this island against reactive oxygen species (ROS) damage. The latter two, PAGI-2 and PAGI-3, were discovered in the type C strain of P. aeruginosa (isolated from cystic fibrosis patients) and strain SG17M (aquatic strain), respectively. The island PAGI-2 contains numerous genes encoding transporters, regulators, and proteins needed for biosynthetic pathways. The most crucial seem to be the proteins involved in the biogenesis of cytochrome c, which provide a selective advantage to the bacteria to thrive in an environment with oxidative stress and deprived of iron. Cytochrome c-mediated iron uptake and inactivation of free radicals appear to underlie this advantage. Furthermore, PAGI-3 was found to be a metabolic island without any virulence factors (Schmidt and Hensel 2004). Six more novel islands were identified through a subtractive hybridization approach from P. aeruginosa clinical isolates (Battle et al. 2009).

In conclusion, it may be stated that GIs are still an enigma. The evolution of prokaryotes, particularly that of eubacteria, is primarily shaped by forces like HGT, and GIs contribute vastly to this process. With the availability of a deluge of whole genome sequence information due to next-generation sequencing techniques, the puzzle of bacterial evolution and speciation has begun to unravel, albeit slowly. With more than 400,000 prokaryotic whole genome sequences in global databases at present, biologists have an enormous amount of data to analyze and decipher newer paradigms in bacterial evolution.