Introduction

The external environment has an enormous ability to regulate the cellular environment. Organisms develop survival strategies and adjust their equilibrium states in response to environmental disruptions or insults. Such changes cannot be explained solely by conventional methods, necessitating the use of the term ‘epigenetics’. The Greek prefix ‘epi’ stands for ‘on’ or ‘in addition to’ and the word ‘genetic’ means ‘pertaining to or produced from genes’. In the early 1940s, Conrad Waddington coined the word ‘epigenetics’ and defined it as ‘the branch of biology that studies the causal interactions between genes and their products that bring the phenotype into being’ [141]. Thus, in its early decades, epigenetics attempted to unite genetics and developmental biology and to provide new insights into the mechanisms that unfold the genetic programme for development. In the last two decades of the twentieth century, significant progress was made regarding the relationship between DNA methylation and gene expression in various biological contexts, and the methodology for studying epigenetics was established [56].

The description of phase variation of pyelonephritis-associated pili in Escherichia coli provided the first explanation of how DNA methylation governs bacterial lineage development [12]. Bacteria, like eukaryotes, use post-replicative DNA methylation for the epigenetic regulation of DNA–protein interactions. In bacteria, however, DNA adenine methylation rather than the DNA cytosine methylation seen in eukaryotes acts as the epigenetic signal. Methylation, performed by a group of enzymes known as methyltransferases (MTases), is the most studied epigenetic signal in prokaryotes, and S-adenosyl-L-methionine (SAM) serves as the methyl donor. In eukaryotes, the dominant DNA modification is 5-methylcytosine (5mC), in which a methyl group is transferred from SAM to the C5 position of an unmodified cytosine [75]. 5mC is necessary in eukaryotes but not prevalent in prokaryotes. Instead, prokaryotes carry 4-methylcytosine (4mC), in which MTases modify the N4 position of cytosine, and N6-methyladenine (6mA), which results from the transfer of a methyl group from SAM to the N6 position of adenine [24]. The DNA adenine methylase (Dam), which methylates adenine in the 5′-GATC-3′ motif, was the first orphan methyltransferase found in prokaryotes [53].
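The notion of a Dam target site can be made concrete with a short sketch. The Python below is a minimal illustration, not a published tool: the function names and the toy sequence are assumptions made for this example. It locates every GATC motif in a sequence and confirms that the motif is its own reverse complement, which is why each site carries a target adenine on both strands.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def find_sites(seq: str, motif: str = "GATC") -> list:
    """Return 0-based start positions of every (possibly overlapping) match."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


# Made-up toy sequence for illustration; it is not taken from any genome.
toy_sequence = "TTGATCAAGGATCCGATCTT"
gatc_starts = find_sites(toy_sequence)

# The adenine that Dam methylates is the second base of each GATC match.
target_adenines = [start + 1 for start in gatc_starts]

# GATC reads the same on the opposite strand (it is palindromic).
is_palindromic = reverse_complement("GATC") == "GATC"
```

Because the motif is palindromic, a freshly replicated site is methylated on the parental strand only, which is the hemimethylated state discussed in later sections.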

To adapt to threats within and between human hosts, human-adapted bacterial pathogens exploit a process called phase variation to spontaneously switch the expression of individual genes, resulting in a phenotypically varying population. In phase variation, gene expression alternates between active (ON phase) and inactive (OFF phase) states. In uropathogenic E. coli (UPEC) cells, for example, pilus phase variation can be seen using immunoelectron microscopy. Changes in nucleotide sequence (e.g., site-specific recombination and mutation) can cause phase variation, resulting in heritable changes in gene expression. Bacteria also control phase variation through epigenetic mechanisms [10, 38, 51, 125]. In every case studied, these systems use DNA methylation patterns to communicate information about the mother cell's phenotypic expression status to the daughter cells. A DNA methylation pattern arises when regulatory protein(s) bind to a location that overlaps a methylation target and thereby inhibit methylation. This pattern can influence gene expression if methylation alters the binding of regulatory protein(s) to their DNA target site, which can happen owing to steric hindrance or methylation-induced changes in DNA structure [109]. Restriction-modification (R-M) systems with phase-variable expression are observed more often and produce global variations. According to an analysis of phase-variable expression of R-M systems (Type I and Type III) in pathogenic bacteria adapted to humans, the expression of multiple genes is controlled by methylation. These phase-variable regulons are called phasevarions. By switching between virulent and avirulent phenotypes, phase variation facilitates evasion of the host immune response. Hence, phase variation through methylation enables seemingly isogenic bacterial populations to give rise to multiple epigenetically distinct subpopulations [37, 51, 138].

Recent advancements in technology have enabled high-throughput sequencing to analyze methylomes and epigenomes. Single-molecule real-time (SMRT) sequencing has facilitated the detection of modified bases beyond the canonical A, T, C, and G [6, 30]. An overview of bacterial epigenetics is presented in Fig. 1.

Fig. 1

An overview of bacterial epigenetics. There are two broad mechanisms of epigenetics in bacteria: DNA modification and RNA modification. DNA modification is widely studied and involves methylation of nucleobases and alteration of the sugar-phosphate backbone through phosphorothioation. These two modification machineries interact with each other. RNA modification occurs through methylation of adenine and post-transcriptional modification (capping).

Infections have recently emerged as one of the causes that can drastically alter epigenetic patterns. Epigenetic deregulation caused by bacterial infection may affect host cell function, allowing pathogen persistence or encouraging host defense. As a result, pathogenic bacteria could be regarded as epimutagens or epigenome modifiers. Their impacts may leave particular, long-lasting imprints on host cells, resulting in an infection memory that affects immunity and may be at the root of unexplained disorders [7]. Acquisition of immunological memory by the host innate immune system, for example, is heavily reliant on epigenetic imprinting. It is also becoming clear that inadequately controlled epigenetic processes play a role in the onset of major diseases, including cancer and autoimmune diseases. Indeed, a growing body of evidence suggests that changes in the cellular epigenome caused by infection are linked to pathologic processes in infected animals, whether acute or chronic. This idea is crucial in persistent infections, where the pathogen manipulates the host cell for survival. Given the heritability of epigenetic modifications, another intriguing theory is that epigenetic changes linked with infection may cause host cell functioning to change even after the illness has been treated [40].

DNA methyltransferases of the restriction modification system

DNA methylation, catalyzed by DNA methyltransferases that add a methyl group to DNA, is the most common type of post-replicative nucleotide alteration seen in the genomes of prokaryotes and eukaryotes. Restriction-modification (R-M) systems are two-component systems consisting of a restriction endonuclease that cleaves DNA at a specific site and a DNA methyltransferase that methylates the same site, thereby protecting it from the cognate restriction enzyme [112]; together they act as an immune system in prokaryotes. Although their prime function is to protect the genome, R-M systems are also involved in recombination, transposition, maintaining species identity, and more. In Neisseria meningitidis, for example, commensal and pathogenic isolates are naturally prevented from transferring DNA to each other because the commensal isolates have a DNA adenine methyltransferase that methylates 5′-GATC-3′ sites, whereas the pathogenic isolates harbour a restriction endonuclease for the same site. Hence, the bacterial lineage identity is maintained [62].

Bacteria and archaea possess multiple types of R-M systems. Helicobacter pylori has more than 20 presumed R-M systems, which cover more than 4% of its total genome [82]. Restriction enzymes have been classified into four types based on their number of subunits and organization, regulation, cofactors, specificity and catalytic mechanisms (Table 1). Type I systems target bipartite motifs and cleave several kilobases (kb) away from the non-methylated motif site [84]. In Type II systems, cleavage occurs close to or within the non-methylated motif site [108]. Type III systems are complexes with multiple modification subunits and restriction subunits. Their methylation targets are short, non-palindromic sequences, and the restriction subunit targets the non-methylated motif and cuts 25 bp away from it [113]. DNA cleavage by Type IV systems occurs only when the recognition sequence is methylated [117].

Table 1 Classification of Restriction enzymes

Solitary or orphan methyltransferases

Methyltransferases that lack a cognate restriction endonuclease are termed solitary or orphan methyltransferases [24]. Orphan methyltransferases are involved in regulating the initiation of chromosome replication, gene expression, DNA mismatch repair, and cell cycle progression.

Dam methyltransferase

The first orphan methyltransferase found in prokaryotes (E. coli) is deoxyadenosine methyltransferase (Dam), which methylates the N6 position of adenine in the 5′-GATC-3′ motif. In a random walk on lambda DNA (48,502 bp, 116 dam sites), E. coli Dam checks 3000 dam sites for every binding event, resulting in processive methylation of 55 sites on average.
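To put the lambda DNA figures in perspective, the arithmetic can be spelled out in a few lines of Python. This is only a back-of-the-envelope sketch using the numbers quoted above; the variable names are chosen for this example, and the assumption of evenly spaced sites is a simplification.

```python
# Figures quoted in the text for lambda DNA.
LAMBDA_LENGTH_BP = 48_502
DAM_SITES = 116
SITES_METHYLATED_PER_BINDING = 55

# Average distance between neighbouring dam sites, assuming even spacing
# (a simplification; real sites are not evenly distributed).
mean_spacing_bp = LAMBDA_LENGTH_BP / DAM_SITES

# Share of all dam sites on the molecule methylated in one binding event.
fraction_per_binding = SITES_METHYLATED_PER_BINDING / DAM_SITES

print(f"mean spacing: {mean_spacing_bp:.0f} bp")
print(f"fraction methylated per binding event: {fraction_per_binding:.0%}")
```

On these numbers, a single binding event modifies close to half of the dam sites on the molecule, with sites lying a little over 400 bp apart on average.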

DNA methylation is further accelerated by processive methylation. The extremely processive mechanism of E. coli Dam could explain why modest amounts of Dam can keep dam sites methylated during DNA replication. Dam's processivity appears to be beneficial in keeping up with E. coli's rapid growth. Processivity accelerates DNA methylation and brings about a methylation pattern in which fully methylated and unmethylated strands alternate [136]. Dam methyltransferase also plays a role in mismatch repair, for which the right quantity of Dam at the right moment is critical. E. coli cells without dam were found to be viable but had a higher rate of spontaneous mutations, implying that methylation plays a role in DNA repair [91]. The unmethylated strand of a 5′-GATC-3′ site is recognized and cleaved by MutH (a mismatch repair protein) while the parental strand remains intact [80]. The expression of the dam gene is governed by five promoters and regulated by the bacterial growth rate [20]. Dam directly regulates some phase variations in E. coli through DNA methylation [51]. Phase variation is a genetic regulatory system in which distinct, stable expression levels of particular genes occur in subpopulations of bacteria within a larger population. By effectively maintaining different pre-acclimatized groups of cells, these systems allow bacteria to cope with rapidly fluctuating surroundings.

The Dam enzyme has nine amino acid sequence motifs, including a highly conserved DPPY motif involved in the binding of SAM [88]. Throughout the Gammaproteobacteria (orders Enterobacterales, Pasteurellales, Vibrionales, Alteromonadales and Aeromonadales in particular), with the exception of one clade, homologs of dam or dam-like genes are associated with restriction endonucleases, suggesting a role in R-M systems. Dam homologs are identified as orphan methyltransferases in a unique clade of Gammaproteobacteria that includes E. coli and Salmonella. Other characteristics shared by this clade include the presence of MutH and SeqA, the organization of the dam gene in an operon with aroB and aroK, and an overabundance of 5′-GATC-3′ sites in oriC, in the genes surrounding oriC, and in the dnaA promoter [16]. Homologs of SeqA and MutH are essential for survival in some members of Gammaproteobacteria, such as Yersinia pseudotuberculosis and Vibrio cholerae, but are not essential for viability in E. coli and Salmonella [91]. In Salmonella, DNA methylation is critical for virulence and also for the transcription of regulator genes engaged in the conjugal transfer of the virulence plasmid pSLT [24, 50, 105]. Several genes linked to virulence were found to be downregulated in dam mutants, including pmrB, which functions in resistance to host defence peptides such as defensins; spvB, which is necessary for bacterial growth during infection; and mgtA, which encodes an ATPase involved in magnesium transport [50]. The release of effector proteins involved in host cell invasion, including SipA, SipB, and SipC, was also reduced in dam mutants [43]. The viability of dam mutants is unaffected, but their pathogenicity is reduced (Table 2) [43, 50]. In E. coli, three other methyltransferases, YhdJ, Dcm, and HsdM, have been characterized. YhdJ is a non-essential orphan methyltransferase that methylates the second adenine of the 5′-ATGCAT-3′ motif [17]. Dcm is also an orphan methyltransferase and methylates the second cytosine in the 5′-CC(A/T)GG-3′ motif [96]. It is not essential for viability but was found to regulate the gene coding for RpoS, a stress response sigma factor, and many RpoS target genes during the stationary phase [64]. The third enzyme, HsdM, is an adenine methyltransferase that belongs to an R-M system [3].

Table 2 Insights into DNA methylation types

Cell cycle regulated methylase (CcrM)

CcrM was first identified in Caulobacter crescentus and targets 5′-GANTC-3′, where N is any nucleotide. It has a slightly greater affinity for hemimethylated DNA and uses SAM as a donor of the methyl group [119, 126]. Agrobacterium tumefaciens, Ensifer meliloti, and Brucella abortus all have CcrM homologs. It is only present for a short time during the cell cycle, coinciding with transcription and translation [152]. The C-terminal end of CcrM, essential for methyltransferase activity, has four conserved motifs prevalent among all homologs of Alphaproteobacteria. CcrM is involved in regulating cell division proteins FtsZ, MipZ, and FtsW. Some studies suggest that CcrM is required for growth, while others suggest that it is not [45, 106]. CcrM regulates the activity of two global cell cycle regulators, GcrA and CtrA [39, 114].
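Degenerate recognition motifs such as GANTC (CcrM) or CC(A/T)GG (Dcm) can be searched for by translating the ambiguity symbols into a regular expression. The sketch below is illustrative only: it covers just the two ambiguity codes needed here (N for any base, W for A or T), the helper names are assumptions for this example, and the toy sequence is invented.

```python
import re

# Subset of nucleotide ambiguity codes needed for the motifs in the text.
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",  # any nucleotide
    "W": "[AT]",    # A or T
}


def motif_to_regex(motif: str) -> "re.Pattern":
    """Compile a motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(AMBIGUITY[symbol] for symbol in motif)
    return re.compile(f"(?=({body}))")


def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start positions of every match of the motif."""
    return [match.start() for match in motif_to_regex(motif).finditer(seq)]


# Made-up toy sequence, used only to exercise the functions.
toy_sequence = "AGAATCTTGACTCCCAGGGATC"

ccrm_sites = find_motif(toy_sequence, "GANTC")  # CcrM-type motif
dcm_sites = find_motif(toy_sequence, "CCWGG")   # Dcm-type motif
dam_sites = find_motif(toy_sequence, "GATC")    # Dam-type motif
```

The same function handles exact motifs such as GATC, so one scanner can tabulate candidate sites for several methyltransferases at once.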

Dam and CcrM catalyze similar reactions on similar target sequences, but they do not share a common ancestral origin; from an evolutionary point of view, they originated independently. Both act on hemimethylated substrates, but their interactions with the guanine next to the target adenine differ [2, 5]. Dam interacts with the 5′ guanine on the non-target strand, two base pairs away from the target adenine, while CcrM recognizes the guanine adjacent to the adenine on the same strand. In the Dam-bound state, the paired bases in the DNA remain unaltered, whereas binding of CcrM creates a bubble by pulling the two strands apart, leaving four unpaired bases. These differences allow CcrM to be active on double-stranded and single-stranded DNA as well as on mismatched regions [57–59, 145].

DNA repair machinery

Dam mutants of E. coli and S. enterica express a hypermutable phenotype, which indicates that these mutants cannot carry out mismatch repair. Base excision repair or nucleotide excision repair machinery cannot repair the excess transition mutations because mismatches generated during replication involve normal bases, so the error-free template strand must be distinguished from the error-prone daughter strand. Dam hemimethylation enables this discrimination in E. coli and other representatives of Gammaproteobacteria. MutS recognizes the mismatched bases and recruits MutL and MutH, forming a ternary complex. In the non-methylated strand, MutH cleaves the phosphodiester bond 5′ to the G of 5′-GATC-3′. A helicase (UvrD) removes MutH from the ternary complex and unwinds the DNA. The single-stranded gap is filled by DNA polymerase III, and DNA ligase seals the nick. Dam then restores the fully methylated state. MutH cannot cleave methylated DNA; hence MutH targets the short period of hemimethylation after replication to form the ternary complex [111]. In dam mutants, DNA is subject to double-stranded cleavage by MutH [90–92].
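The strand-discrimination logic can be summarized as a small truth table. The following Python sketch encodes one simplifying assumption drawn from the paragraph above, namely that MutH can nick a strand only when that strand's GATC adenine is unmethylated; the function name and strand labels are invented for this illustration.

```python
def strands_open_to_muth(template_methylated: bool,
                         daughter_methylated: bool) -> set:
    """Return the strands of one GATC site that MutH could nick.

    Simplifying assumption: a strand is cleavable only if its GATC
    adenine is unmethylated.
    """
    open_strands = set()
    if not template_methylated:
        open_strands.add("template")
    if not daughter_methylated:
        open_strands.add("daughter")
    return open_strands


# Wild type, just after replication: hemimethylated, so only the newly
# synthesized strand is nicked and the template is used for repair.
hemimethylated = strands_open_to_muth(True, False)

# After Dam has finished: fully methylated, MutH has no target.
fully_methylated = strands_open_to_muth(True, True)

# dam mutant: neither strand is protected, so both can be cut,
# which corresponds to the double-stranded cleavage described above.
dam_mutant = strands_open_to_muth(False, False)
```

The three cases reproduce the behaviour described in the text: targeted repair in the hemimethylated window, no cleavage once methylation is complete, and loss of strand discrimination when Dam is absent.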

Overproduction of Dam methylase also hinders MutH from repairing mismatches because it shortens the duration of the hemimethylated state. Once both strands are methylated, 5′-GATC-3′ can no longer operate as a signal for MutH cleavage, so the repair mechanism requires a specific amount of Dam methylase to work [83].

Cell cycle

The Dam methylation readers DnaA and SeqA control the initiation of replication and the segregation of chromosomes during replication. DNA replication initiates when DnaA binds to the origin of replication (oriC), which requires the 5′-GATC-3′ sites within oriC to be methylated. In the hemimethylated state, oriC is inactive. SeqA delays Dam activity by binding to hemimethylated 5′-GATC-3′ sites and acts as a repressor of the dnaA gene. SeqA thus sequesters the origin from re-initiation and also plays a role in condensation, cohesion, and segregation. The binding of SeqA to hemimethylated 5′-GATC-3′ sites behind the replication fork directs nucleoid organization [142].

SeqA acts as a check to the over-initiation of DNA replication by preventing quick methylation of oriC and a constant presence of Dam throughout the cell is needed for this function. In the absence of Dam, mismatch repair protein-induced-cleavage might turn out to be lethal for the cell.

N(6)-methyl-adenines are epigenetic signals that affect cellular activities such as chromosome replication initiation and gene expression by modulating the interaction between regulatory DNA regions and regulatory proteins [31]. Dam in Gammaproteobacteria and CcrM in Alphaproteobacteria mediate DNA adenine methylation. A significant difference is that CcrM is cell cycle regulated, whereas Dam is active throughout the cell cycle. 5′-GANTC-3′ sites in Alphaproteobacteria can remain hemimethylated for a long time during the cell cycle, depending on where they are on the chromosome. In Gammaproteobacteria, most 5′-GATC-3′ sites are only transiently hemimethylated, except for regulatory 5′-GATC-3′ sites that are protected from Dam methylation by certain DNA-binding proteins [31].

Detection of epigenetic tags

Traditionally, several approaches have been used for characterizing the DNA methylome. Restriction enzyme-based mapping is a reliable and robust detection method, but the limited availability of restriction enzymes with known specific target sequences limits its usage [35]. Bisulfite sequencing is a gold standard for detecting 5mC, but it is unable to detect 6mA, and detection of 4mC requires additional conversion steps [25, 64, 151]. Quantification of 5-methylcytosine can be performed by a combination of a specific antibody and a fluorescent stain [140]. Several commercially available ELISA (enzyme-linked immunosorbent assay) kits also enable screening of global DNA methylation [76]. The classical dot blot assay has a sensitivity of 0.15 ng, equivalent to that of ELISA, and provides a measurement of 5mC within a very short period [85]. Multiple reaction monitoring in a mass spectrometer (MS-MRM) coupled with nano-ultra HPLC makes it possible to determine the abundance of the various DNA bases along with their modifications [78]. Single-molecule real-time (SMRT) sequencing can report all three types of methylation, with the highest sensitivity for 6mA and the lowest for 5mC, which requires additional deep sequencing steps. Through SMRT sequencing, both the methylated sites and the motifs can be identified at single-molecule and single-nucleotide resolution. Nanopore sequencing allows direct identification of DNA and RNA base modifications. With its advent, the challenges of inferring epigenetic data from long reads, phasing, and mapping multiple methylated bases have been overcome.

Methylome studies have enabled the identification of a diverse set of MTases and their specific target sequences. A survey of a wide range of bacterial and archaeal epigenomes revealed that 93% of the genomes carry DNA methylation. The spread of this diversity is primarily driven by horizontal gene transfer of mobile genetic elements containing MTases [11, 32, 68, 69].

Epigenetic regulation

Phase variation and bistability

A hallmark of bacterial cells is their ability to undergo persistence, a phenomenon of metabolic arrest. Under unfavourable conditions, the cells shift to an inactive state and revert to a normal active state once conditions are favourable. This bifurcation of clonal cells or populations is known as bistability, and the reversal to the original phenotype is known as phase variation. Persistence is a bet-hedging strategy that allows bacterial cells to survive fluctuations and enables the expression of multiple variations of stress-responsive genes. Variations may be mediated by gene inversions, complex recombination systems, methylation or demethylation of promoters, slipped-strand mispairing, and transposons [37, 138]. A tract of simple DNA sequence repeats (SSR) in the open reading frame (ORF) of a gene can leave the gene in the ON (expressible) state. A frame-shift mutation downstream of the SSR results in the OFF phase and forms a truncated protein. Phase variation also manifests through reversible inversions of DNA regions [1, 33, 73, 98, 99, 110, 121, 123, 128, 134].
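The link between repeat-tract length and the ON/OFF state comes down to reading-frame arithmetic: gaining or losing repeat units shifts the frame unless the change in length is a multiple of three. The sketch below illustrates this under stated assumptions; the function name, the repeat lengths and the copy numbers are hypothetical and not taken from any specific gene.

```python
def is_in_frame(unit_length: int, copies: int, on_copies: int) -> bool:
    """Return True if the ORF downstream of the repeat tract stays in frame.

    unit_length: length of one repeat unit in bp (1 for a homopolymer).
    copies:      current number of repeat units.
    on_copies:   hypothetical copy number at which the gene is in frame (ON).
    """
    length_change_bp = (copies - on_copies) * unit_length
    return length_change_bp % 3 == 0


# Hypothetical homopolymeric tract that is in frame at 10 copies.
poly_g_states = {n: is_in_frame(1, n, 10) for n in range(8, 14)}

# Hypothetical tetranucleotide repeat that is in frame at 20 copies.
tetra_states = {n: is_in_frame(4, n, 20) for n in (19, 20, 21, 23)}
```

With a one-base unit, losing or gaining a single copy switches the gene OFF and the frame is only restored every third copy; a repeat unit whose length is a multiple of three would never shift the frame under this simple model.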

Epigenetic invertons are DNA sequences that contain host specificity (hsd) genes flanked by inverted repeats and undergo inversions catalyzed by an invertase. Invertons were found to regulate genes for antibiotic resistance, and a shift from the OFF to the ON phase occurs in the presence of antibiotics in the human gut. When antibiotics are absent, however, invertons enable colonization and survival of the pathogen. This trade-off reduces the fitness cost of maintaining genes for antibiotic resistance. Hence, it can be concluded that invertons are vital for the co-existence of host and microbe and also pose a serious obstacle to combating antibiotic resistance [150].

Phase variations occur at high frequencies of greater than 10⁻⁵ per generation, giving rise to phenotypic heterogeneity. Phase variation is a phenomenon of stochastic switching of gene expression between an ‘ON’ and an ‘OFF’ state, altering transcription between the different states [51]. All the diverse conserved genes responsible for phase variation within a species or genus are collectively known as the phasome, and the combined expression states of the different phase-variable genes are defined as the phasotype [107]. Hypermutation in DNA methyltransferase genes can lead to frame-shift mutations. In N. meningitidis, phase-variable expression of N6-adenosine DNA methyltransferases (Mod) has been observed as a result of hypermutation, which leads to loss or gain of DNA repeats within the open reading frame (ORF) and causes frame-shift mutations [63]. Such changes alter the ON/OFF state of expression, changing the expression profiles of genes for nutrient acquisition, virulence, metabolic processes, and colonization within the host. ModA11 and ModA12 have been found to increase susceptibility to antibiotics such as ceftazidime, ciprofloxacin, nalidixic acid, cloxacillin, and doxycycline. A wide range of minimum inhibitory concentrations (MICs) is observed for different phasevarions. Epigenetic regulation of gene expression through DNA methylation may act in synergy with mutations and broaden the spectrum of antibiotic resistance [11]. Methylome analysis may help to identify how the antigenic profile of the population is modulated and how invasive meningococci might emerge over time.
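How stochastic switching produces a mixed population can be illustrated with a two-state model. The sketch below is a deliberately simple deterministic model, assumed here for illustration rather than taken from the cited studies: each generation a fixed fraction of ON cells switches OFF and a fixed fraction of OFF cells switches ON, with hypothetical rates and no growth advantage for either state.

```python
def steady_state_on(on_to_off: float, off_to_on: float) -> float:
    """Long-run fraction of ON cells for constant per-generation rates."""
    return off_to_on / (on_to_off + off_to_on)


def on_fraction_after(generations: int, on_to_off: float,
                      off_to_on: float, start_on: float = 1.0) -> float:
    """Iterate the ON fraction of the population generation by generation."""
    on = start_on
    for _ in range(generations):
        # ON cells that stay ON plus OFF cells that switch ON.
        on = on * (1.0 - on_to_off) + (1.0 - on) * off_to_on
    return on


# Hypothetical switching rates per cell per generation.
ON_TO_OFF = 1e-3
OFF_TO_ON = 1e-4

long_run = steady_state_on(ON_TO_OFF, OFF_TO_ON)
after_100 = on_fraction_after(100, ON_TO_OFF, OFF_TO_ON)
```

Even a clonal population that starts entirely ON drifts toward a mixture whose composition depends only on the ratio of the two rates, which is the sense in which phase variation generates heterogeneity without selection.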

In Streptococcus pneumoniae, phase variations are controlled by the ‘locus inverting’ Type I phase-variable system. The multiple variable hsdS genes shuffle and recombine to produce different methyltransferases with unique specificities. Such random recombination of the hsdS gene in the SpnD39III locus leads to the formation of six distinct target specificities for methyltransferases. The expression of several genes involved in nutrient acquisition, stress response and capsule biosynthesis was differentially regulated by these variants [89].

Pseudomonas aeruginosa, an opportunistic pathogen, infects a wide range of tissues in an immunocompromised host. Genetic regulation alone is not effective for adapting quickly to a harsh environment; hence, phenotypic variation allows P. aeruginosa to cope with such changing environments. The cupA genes, which encode fimbriae for surface attachment during biofilm formation, are controlled by MvaT [135]. A bistable switch of the cupA genes can be speculated to contribute to cell fitness: in the ON phase, the cells may form biofilms, while in the OFF phase the cells may enter persistence. Another bistable switch in P. aeruginosa involves the LysR-type transcriptional regulator BexR, which regulates a set of genes including aprA for alkaline protease. The bistable expression of BexR can affect the expression of several downstream genes, including aprA for the virulence factor alkaline protease and genes for detoxification and the metabolism of certain small molecules, indicating the vital role of BexR expression or repression for the viability and virulence of the strain [135].

The Type VI Secretion System (T6SS), present in the cell envelope of most Gram-negative bacteria, is involved in virulence and defense against a host. There is cross-talk between the T6SS and other virulence-related factors involved in biofilm formation, invasion, adhesion, and toxin secretion. Fur, the ferric uptake regulator, and the sci1 T6SS gene cluster in enteroaggregative E. coli (EAEC) are the major players in the control of the epigenetic switch. The sci1 T6SS gene cluster switches ON and OFF in response to the availability of iron and DNA replication [19]. In the presence of iron, Fur acts as a repressor that binds to the Fur-box region in iron-regulated promoters. The Fur-box is a 19-nucleotide consensus binding motif, 5′-GATAATGATAATCATTATC-3′. When iron is unavailable, Fur is released from the promoter region, allowing RNA polymerase to bind and transcription to proceed. The promoter of the sci1 gene cluster has three 5′-GATC-3′ motifs, the putative sites for methylation by Dam at the N6 position of the adenine. Isoschizomer digestion assays and electrophoretic mobility shift assays demonstrate that Dam methylation and the affinity of Fur for the Fur-box are inversely related, i.e., a higher affinity of Fur for the Fur-box prevents methylation of the 5′-GATC-3′ motifs and vice versa. Hence, there exists a regulatory switch that causes the transition between the ON and OFF states depending on the iron concentration [100].

Uropathogenic E. coli (UPEC), which is responsible for urinary tract infection, expresses pathogenicity factors such as biofilm formation (for colonizing the urinary tract), type I pili, P adhesin (pap) and the siderophore aerobactin (aer) [47, 48, 118]. Phase variation regulates the expression of Pap pili [139], and this is further controlled by DNA methylation [15]. Pili are not expressed constitutively in E. coli; their expression is regulated by external stimuli, mainly temperature and glucose concentration. At 37 °C in a low-glucose medium, cells express P pili (ON), whereas in a high-glucose medium, at 26 °C or even at 37 °C, cells do not express pili (OFF). Approximately 18,000 5′-GATC-3′ sites in E. coli are methylated by Dam. In the regulatory region of the pap operon, two 5′-GATC-3′ sites are protected differentially [103].

Glycosyltransferase operons (gtr) are involved in regulating the O-antigen composition of LPS in Salmonella. The gene cluster (gtrP22) of the O1 serotype of phage P22 Salmonella is controlled by epigenetic modification and regulated by the joint action of Dam and the oxidative stress regulator OxyR. Dam targets four 5′-GATC-3′ motifs (arranged in two pairs, referred to here as GATC1 to GATC4) located 115 bp upstream of the gtrP22 transcription start site [18]. On comparing the OxyR consensus binding site and the gtr sequence, three blocks of sequence, each with 9–10 nucleotides, were found to be conserved between them. OxyR interacts with the DNA binding motifs as a dimer of dimers through three binding half-sites, termed OxyR(A), OxyR(B), and OxyR(C) [131]. OxyR(A) overlaps GATC1 and GATC2, and OxyR(C) overlaps GATC3 and GATC4, suggesting competition between Dam and OxyR for binding at each of the overlapping sites [18]. The transcription start site lies 33 bp upstream of GATC4, and in the promoter OxyR(BC) overlaps the GATC4 site, suggesting epigenetic regulation by Dam and OxyR. In the ON phase, OxyR binds to OxyR(AB) because GATC1 and GATC2 are unmethylated, while methylation of GATC3 and GATC4 prevents OxyR from binding to OxyR(C). In the OFF phase, GATC1 and GATC2 are methylated, and GATC3 and GATC4 are unmethylated. In this case, binding of OxyR to OxyR(A) is inhibited by methylation, while OxyR remains bound to OxyR(BC). In short, methylation of the 5′-GATC-3′ motifs prevents OxyR binding, and once OxyR is bound, it does not allow Dam activity [18].
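The methylation patterns described for the two phases can be written down as a small lookup. The Python sketch below encodes only what the paragraph states (the four GATC sites in order, with the first pair unmethylated and the second pair methylated in the ON phase, and the reverse in the OFF phase); the function name and the "undetermined" label for any other pattern are assumptions for this illustration.

```python
def gtr_phase(gatc_methylated: tuple) -> str:
    """Classify the gtr switch from the methylation state of four GATC sites.

    gatc_methylated: four booleans for the first to fourth GATC site,
    True meaning the site is methylated by Dam.
    """
    first, second, third, fourth = gatc_methylated
    upstream_methylated = first and second
    upstream_free = not first and not second
    downstream_methylated = third and fourth
    downstream_free = not third and not fourth

    if upstream_free and downstream_methylated:
        # OxyR occupies the A/B half-sites; C is blocked by methylation.
        return "ON"
    if upstream_methylated and downstream_free:
        # OxyR occupies the B/C half-sites; A is blocked by methylation.
        return "OFF"
    # Mixed patterns are not described in the text.
    return "undetermined"


on_state = gtr_phase((False, False, True, True))
off_state = gtr_phase((True, True, False, False))
```

Because OxyR binding and Dam methylation exclude each other at the overlapping sites, each pattern tends to be re-established after replication, which is what makes the two states heritable.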

Epigenetic regulation of spore formation

Clostridium difficile is one of the most common nosocomial infectious agents, posing serious healthcare implications. Its ability to resist antibiotics relies on sporulation. An extensive methylome analysis revealed an m6A MTase (CD2758) that is conserved among approximately 300 published C. difficile genomes, which share a common methylation motif (5′-CAAAAA-3′). Inactivation of the MTase (CD2758) hindered sporulation. Previous studies indicate that this methylase gene of C. difficile has no cognate restriction enzyme and hence acts solely in transcriptional regulation [52]. Thus, in stressful environments, epigenetic modifications help pathogenic organisms to survive. MTase mutants have been shown to under-express sporulation-specific genes and to form spores less efficiently (Fig. 2). RNA-seq added a new dimension to the study by suggesting that the MTase (CD2758) has pleiotropic effects beyond sporulation. The over-representation of 5′-CAAAAA-3′ in regions encoding motility, biofilm formation, membrane transport, and in vivo colonization further supports its pleiotropic role [104].

Fig. 2

Epigenetic regulation of sporulation in bacteria. On exposure to antibiotic stress, MTases generate m6A, which controls the overexpression of sigma factors responsible for sporulation. MTase mutants show reduced spore formation and under-expression of sigma factors.

Phosphorothioation and methylation

The DNA degradation (Dnd) phenotype was observed on electrophoresis of genomic DNA in Streptomyces lividans, and it was assumed to be a part of the post-replicative modification machinery [49]. Phosphorothioation (PT) is a DNA alteration in which a sulfur atom replaces a non-bridging oxygen in the sugar-phosphate backbone. A series of DndABCDE and SspABCD proteins are used in the process, which is highly sequence-specific [143, 148, 149]. DndABCDE mediates PT modifications in 4-bp consensus motifs (PS marks the phosphorothioate linkage), such as 5′-G(PS)GCC-3′/5′-G(PS)GCC-3′ in P. fluorescens pf0-1, 5′-G(PS)AAC-3′/5′-G(PS)TTC-3′ in Salmonella enterica serovar Cerro 87 and E. coli, and 5′-G(PS)ATC-3′/5′-G(PS)ATC-3′ in Hahella chejuensis KCTC 2396 (Table 3) [22, 28, 133]. SspABCD introduces PT in single-stranded DNA at 5′-C(PS)CA-3′, but no alteration occurs in the complementary strand 5′-TGG-3′ [21, 148].

Table 3 Role of DndABCDE proteins in Phosphorothioation

DndA and SspA are cysteine desulfurases, while DndC and SspD are ATP pyrophosphatases. These functional similarities hint that the two PT machineries share a common sulfur mobilization mechanism but diverge at the steps involved in the selection of DNA targets. SspB is a DNA-nicking nuclease with a distinct role in ssDNA PT modification, but such a role has not yet been observed in the Dnd system [148].

The two PT machineries adopt different strategies to protect self DNA from foreign DNA. In the Dnd system, the dndFGH cluster lies near the dndBCD operon, and its three gene products (DndFGH) use PT modification to differentiate self DNA from unmodified, non-PT invasive DNA and destroy the latter [64]. However, in the absence of PT modification, unrestrained DndFGH damages DNA, triggering the SOS response, filamentation of the cell, and prophage induction [22, 42]. The Ssp system adopts an unusual defense strategy by employing the SspE restriction component to exert anti-phage action. The GTPase activity of SspE senses sequence-specific motifs and introduces nicks in the phage DNA [148].

Briefly, two unusual features distinguish PT modifications from the classical R-M systems. First, PT modification is only partial. Phosphorothioation was observed in 10–15% of 5′-GAAC-3′ or 5′-GTTC-3′ sites throughout the genome in Cerro 87 and of 5′-CCA-3′ sites in FF75. In B7A, only 12% of the 40,701 possible PT modification sites were PT protected. Moreover, within a population of DNA molecules the PT profiles were heterogeneous, despite the presence of the restriction counterparts DndFGH and SspE [21, 144, 146]. Second, although PTs are effective as part of R-M systems, recent research shows that they are also prevalent in bacteria lacking restriction genes. The methylation consensus sequences of R-M systems and MTases are different [11, 72]. It has been pointed out, however, that R-M systems and PT modification can target the same core consensus sequence. Such overlap is often indicative of functional cooperation between the two systems [27, 143].

Two questions arise from the co-occurrence of both DNA modifications at the same consensus sequence: first, what the mechanism of target selection is in this situation, and second, how each modification function is altered by this proximity. To investigate this, a model system was built with Dam from E. coli DH10B and the PT modification system from H. chejuensis KCTC 2396, both of which recognize and modify 5′-GATC-3′. In vitro analysis of Dam activity on the PT substrate GPSATC, measured by LC–MS/MS, revealed inhibition of Dam activity irrespective of whether the substrate was double-stranded or single-stranded. PT modification, i.e. the substitution of sulfur, thus interfered in vitro with either substrate recognition by the Dam methylase or its catalytic efficiency. When the interaction of the two modifications was studied in vivo, the results contradicted the in vitro findings. In E. coli HST04, mapping of 6mA at 5′-GATC-3′ sites showed that 37,610 of 37,698 5′-GATC-3′ sites were modified, suggesting that the inhibitory effect of PT on Dam modification was overcome in vivo, or that methylation was synchronized to occur either together with or before PT modification. Since 6mA methylation by Dam occurs immediately after replication, one possible interpretation is that PT and 6mA are introduced in tandem at hemimodified sites shortly after replication. SMRT sequencing analysis of the locations of the PT and Dam modification targets showed that PT modification was present in only a fraction of the 5′-GATC-3′ sites modified by Dam. Dam and Dnd DNA modifications therefore use different target selection mechanisms [24, 143].
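As an illustration of the site-level reasoning above, the following minimal Python sketch locates GATC motifs in a sequence and computes the fraction of sites carrying a modification. The function names and the toy sequence are hypothetical and serve only as an example. The counts 37,610 and 37,698 are taken from the text. This is not the analysis pipeline used in the cited studies.

```python
import re


def find_motif_sites(sequence: str, motif: str = "GATC") -> list[int]:
    """Return 0-based start positions of every motif occurrence.

    A zero-width lookahead is used so that overlapping occurrences of
    other motifs would also be reported. GATC is palindromic, so the
    same positions mark the motif on both strands.
    """
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() for match in pattern.finditer(sequence.upper())]


def fraction_modified(modified: int, total: int) -> float:
    """Return the fraction of candidate sites that are modified."""
    if total <= 0:
        raise ValueError("total must be positive")
    return modified / total


# Hypothetical toy sequence, for illustration only.
toy_sequence = "ttGATCaaGATCccGATC"
sites = find_motif_sites(toy_sequence)

# Counts reported in the text for 6mA at GATC sites in E. coli HST04.
dam_fraction = fraction_modified(37_610, 37_698)

print(sites)
print(f"{dam_fraction:.2%} of GATC sites methylated")
```

With the reported counts, the Dam-methylated fraction is about 99.8%, which is why the in vitro inhibition by PT appears to be overcome in vivo.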

The proximity of the two modifications further called for an assessment of the effect of 6mA on PT modification. Two distinct outcomes were noted. First, no lethal effect of the DndFGH restriction proteins was observed despite the absence of PT protection, and bactericidal activity was restored only when the exposure temperature was altered (Fig. 3). Such thermoregulation of resistance has also been studied in LmoH7 and LlaJI [72, 101]. Second, 5′-GATC-3′ methylation protected against DndFGH in the absence of PT to almost the same extent as would have been expected in the presence of PT protection [24, 25, 143].

Fig. 3 Temperature dependence of the DndFGH restriction system. In the absence of PT, DndFGH has a lethal effect. At 37 °C, however, no toxic or lethal effect was observed; on changing the temperature to 15 °C, the bactericidal effect of DndFGH was noted.

Bacterial epigenetics: A front-line research domain in infection biology

Epigenetic deregulation induced by bacteria appears to affect host cell function, either by supporting host defence or by permitting persistence of the pathogen. Pathogens can therefore be regarded as possible epimutagens capable of reshaping the epigenome. Their effects may leave precise, lasting imprints on host cells, resulting in an infection memory that controls immunity and may be a source of infection [41]. It is likely that, even after pathogen elimination, the imprints of a bacterial infection may be passed down to the next generation through chromatin modifications, resulting in heritable changes in gene activity. Hence, it is crucial to determine whether histone modification and/or DNA methylation signatures inflicted by bacterial constituents are preserved over time. Infection by UPEC induced DNA methylation affecting genes engaged in host cell proliferation. Upregulation of DNA methyltransferase (DNMT) activity and DNMT1 expression in human uroepithelial cells, as well as CpG methylation along with downregulation of the G1 cell-cycle inhibitor CDKN2A, was evident following infection with UPEC [130]. Enhancer of zeste homologue 2 (EZH2), which is involved in early host cell proliferative reactions following infection, is altered epigenetically by UPEC-induced paracrine factors [128]. By preventing infection-induced host cell apoptosis, these changes may boost uroepithelial cell growth and pathogen persistence.

Bacteria-mediated epigenetic alterations can also affect other tissues, such as the placenta. In fact, the promoter of the imprinted IGF2 gene is hypermethylated in mouse placental tissue as a result of maternal infection with Campylobacter rectus [13]. This finding implies that bacterial infections during pregnancy may have an epigenetic impact on genes involved in embryonic development. Tissue damage, multiorgan malfunction, septic shock and mortality result from abnormal inflammatory reactions in response to long-term exposure to microorganisms and microbial metabolites such as butyric acid and lipopolysaccharides (LPS). Butyric acid, a short-chain fatty acid that acts as a powerful inhibitor of histone deacetylases (HDACs), is one such bacterial metabolic product that can act on chromatin-modifying enzymes [116]. The effect of Porphyromonas gingivalis on the activity of latent viruses such as the human immunodeficiency virus (HIV) and Epstein–Barr virus (EBV) appears to be linked to the bacterium's production of butyrate [60, 61]. It is hypothesized that viral genes silenced by HDAC-containing complexes are reactivated when HDACs are inhibited by butyric acid. Infection with P. gingivalis could thus be a risk factor for viral diseases such as AIDS or herpes. The anti-inflammatory effects of butyric acid on the host are likely attributable to epigenetic up-regulation of anti-inflammatory genes. These findings raise the intriguing prospect of using butyrate-producing probiotic bacteria as immunosuppressors [81]. The immune system also has a post-septic immunosuppression (PSI) set-up that allows hematopoietic cells to become temporarily less responsive in order to compensate for these negative consequences. This anti-inflammatory benefit lessens the severity of sepsis, but it also makes patients more vulnerable to opportunistic infections for long periods of time, even years.
Although PSI is a complex multifactorial process, the significance of epigenetic regulation in it is increasingly appreciated, as recently reviewed [23, 94]. In people, LPS tolerance can endure for weeks, but it is unclear whether this memory is passed down during cell division. Furthermore, despite the division of imprinted hematopoietic cells, it is unclear why new cells produced from bone marrow progenitors are incapable of reconstructing a functional immune system. An appealing idea is that epigenetic imprinting may also occur at the level of stem cells. To put this theory to the test, the epigenome of stem cells obtained from animal models of sepsis should be examined. Understanding how ‘imprinted’ immune cells regain equilibrium depends on the conversion of heterochromatin to euchromatin at genes suppressed by LPS. Shigella produces peptidoglycans that are recognised by Nod1, which activates the NF-kappaB pathway and results in an intense inflammatory response. Following Shigella infection, an E3 ligase effector, IpaH9.8, was discovered to suppress the NF-kappaB-mediated inflammatory response in a novel way [102]. Alternative splicing occurs in several immunologically important genes [86]; it remains to be shown whether bacteria can influence alternative splicing through chromatin alterations to deregulate immune system activity. The importance of alternative splicing in pathogenic infections is gradually becoming apparent. It has opened up a world of possibilities for future research into host–pathogen interactions, disease aetiology and treatment techniques. While it will be vital to see how disrupted signalling events during infections affect host RNA splicing, it will also be necessary to look at possible pathogenic agents that interact with the splicing machinery and alter the host response machinery [26].

Interaction of multi-drug resistant (MDR) bacteria with host cells can lead to pathogen persistence by guiding molecular perturbations of host transcriptional programmes, including epigenetic-sensitive processes such as DNA methylation, histone modifications and non-coding RNAs. Infections with Mycobacterium tuberculosis, H. pylori, E. coli, Listeria monocytogenes, P. aeruginosa and Legionella pneumophila have all been linked to epigenetic alteration by MDR bacteria, suggesting possible disease biomarkers [34]. The epigenetic alterations that occur in a host during M. tuberculosis (M. tb) infection, and how M. tb influences the host epigenome, have been discussed in a review describing M. tb-induced epigenetic modifications that boost either host defence or M. tb survival [124]. Following the entry of M. tb into the cell, Rv1988, released by the bacterium, binds to chromatin in the host nucleus, altering the expression of host defence genes [66]. For example, TRAF3, in concert with TRAF2, is crucial for cell type- and stimulus-specific synthesis, while NOX1, NOX4 and NOS2 are major sources of reactive oxygen species [4, 147]. Rv2966c, a secretory mycobacterial protein, has been the subject of some interesting research. This 5-methylcytosine-specific DNA methyltransferase produced by M. tb can be found in the nucleus of infected mammalian cells. Rv2966c binds to particular DNA sequences and primarily promotes non-CpG methylation; its activity is positively regulated by phosphorylation [122]. This protein, like Rv1988, can interact with histone proteins, and both are likely critical components of the first impact during infection, when the host defence control centre is hijacked by epigenetically modifying its behaviour [67].

A study of alterations in the phosphoproteome of gastric cells after infection with H. pylori found that RNA processing and splicing factors were enhanced in infected cells [55]. H. pylori is also a significant risk factor for stomach cancer. Aside from the effects of H. pylori on cell proliferation and DNA integrity, aberrant DNA methylation brought about by the pathogen appears to be a key mechanism in causing gastric cancer [36, 137]. Surprisingly, in human patients, eradication of H. pylori infection results in a reduction, but not complete removal, of methylation of promoter CpG islands of genes associated with the progression of gastric cancer [97]. This demonstrates that bacterial infection can leave epigenetic marks in a tissue, allowing for long-term changes in gene expression. Given that cancer must arise from a cell that can divide, this reprogramming by the bacteria may be favoured in long-lived cells such as stem cells or progenitors and then passed down to daughter cells. Adult epithelial cells may be targeted for dedifferentiation if modules of the stem cell signalling network are suppressed, allowing for enhanced propagation and survival [65]. According to a recent study, CagA, a prominent H. pylori virulence factor, has been identified as a substantial contributor to H. pylori infection-mediated modulation of DNA damage repair, potentially altering the balance between DNA damage and repair and favouring genomic instability and carcinogenesis [71]. H. pylori infection in the gastric mucosa is also associated with E-cadherin methylation, reduced expression of the transcription factor USF1 (shown to stabilise p53 in response to genotoxic stress) and reduced expression of WW domain-containing oxidoreductase through promoter hypermethylation (an epigenetic mechanism also occurring in many other cancer cells) [34]. Likewise, E. coli infection has been associated with a higher risk of bladder cancer [132], and gut bacteria may predispose to colon cancer [127].

L. monocytogenes targets the host chromatin and epifactors by different mechanisms. Listeria's use of cytosolic signalling pathways or direct targeting of epifactors in the nucleus to influence the expression of host genes at the chromatin level has contributed to the birth of a new branch of science integrating cellular microbiology with epigenetics [8]. Five mechanisms that mobilise the epigenetic machinery in response to Listeria factors have been revealed. In the first, pattern recognition receptors activate mitogen-activated protein (MAP) kinase signalling pathways when they detect intracellular microbe-associated molecular patterns. This causes mitogen- and stress-activated kinase 1/2 to phosphorylate histone H3 on serine 10 and the acetyltransferases CREB-binding protein and E1A-binding protein p300 to acetylate histones H3 and H4, leading to transcription of pro-inflammatory genes [9]. In the second mechanism, Listeria secretes effectors that initiate signalling cascades leading to histone changes. As part of a signalling cascade involving a K+ efflux at the plasma membrane, the listeriolysin-O toxin triggers dephosphorylation of histone H3 on serine 10 together with histone deacetylation. Internalin B stimulates the c-Met-phosphatidylinositol 3-kinase (PI3K) pathway, causing the silent information regulator 2 homologue sirtuin 2 (SIRT2) to translocate to the nucleus and deacetylate histone H3 acetylated at lysine 18 (H3K18). The expression of a group of defence genes is suppressed by these pathways [8].
In the third mechanism, Listeria infection of epithelial cells induces interferon signalling pathways. An unknown signalling route recruits the histone deacetylase (HDAC) complex to interferon-stimulated genes (ISGs), resulting in histone H3 deacetylation and inhibition of ISG transcription. In the fourth mechanism, the Listeria nuclear targeted protein A gene (lntA) is expressed. The nucleomodulin LntA binds to and inhibits the bromo adjacent homology domain-containing 1 (BAHD1) HDAC complex in the nucleus, restoring histone H3 acetylation at lysine 9 (H3K9) and resulting in increased ISG expression. In the fifth mechanism, Listeria secretes the nucleomodulin open reading frame X (OrfX) in macrophages, where it binds to the nuclear regulator RING1 and YY1-binding protein (RYBP). The interaction between OrfX and RYBP increases p53 stabilisation, which alters the formation of reactive oxygen and nitrogen species (ROS and RNS) derivatives [8, 9].

Extracellular vesicles (EVs) released by P. aeruginosa alter DNA methylation in human lung macrophages, leading to a defective innate immune response [79]. EVs contain proteins involved in antibiotic resistance, host-microbe interactions and proteolysis [29], and these virulence factors are responsible for the declining pulmonary function of patients with cystic fibrosis [14, 70, 87, 95, 115]. Using a genome-wide DNA methylation method, P. aeruginosa EVs have been shown to modify some host cell DNA methylation patterns. A total of 1,185 differentially methylated CpGs were found, with distal DNA regulatory elements such as enhancer regions and DNase hypersensitive sites significantly overrepresented. Remarkably, all but one of the 1,185 differentially methylated CpGs were hypomethylated in association with EV exposure. Significantly hypomethylated CpGs were linked to genes including AXL, CFB and CCL23. After 48 h of P. aeruginosa EV treatment, gene expression analysis revealed 310 genes with significantly changed expression, of which 75 were upregulated and 235 downregulated. For several CpGs related to cytokines, such as CSF3, DNA methylation and gene expression showed strong negative associations. Bacterial secreted products (EVs) can thus affect DNA methylation of the host epigenome [77, 79].
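As a simple cross-check of the expression figures quoted above, the short Python sketch below confirms that the upregulated and downregulated gene counts sum to the reported total and expresses each as a share of all changed genes. The variable and function names are hypothetical. Only the three counts come from the text.

```python
# Counts reported in the text for genes with significantly changed
# expression 48 h after P. aeruginosa EV treatment.
TOTAL_CHANGED = 310
UPREGULATED = 75
DOWNREGULATED = 235


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# The two categories should account for every changed gene.
counts_consistent = UPREGULATED + DOWNREGULATED == TOTAL_CHANGED
up_share = share(UPREGULATED, TOTAL_CHANGED)
down_share = share(DOWNREGULATED, TOTAL_CHANGED)

print(counts_consistent, up_share, down_share)
```

Roughly three quarters of the genes with altered expression were therefore downregulated after EV exposure.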

Numerous host proteins were found to be ubiquitinated, phosphorylated, lipidated, glycosylated, AMPylated, de-AMPylated, phosphocholinated and dephosphocholinated by effectors produced by the intracellular pathogen Legionella pneumophila [120]. By secreting proteins that mimic particular eukaryotic proteins, the pathogen regulates the ubiquitin signalling system. LubX proteins have two U-box domains (U-box1 and U-box2) that are similar to those of eukaryotic E3 ubiquitin ligases and can bind to Clk1 (Cdc2-like kinase 1) in the cell during L. pneumophila infection [74]. LegK1 and LegK2 are two of the most well-studied serine/threonine protein kinases in L. pneumophila. LegK1 causes NF-κB activation and the generation of pro-inflammatory cytokines [44]. LegK2 kinase has been demonstrated to phosphorylate the host protein MBP but does not act in the NF-κB pathway [54]. Another approach utilised by L. pneumophila to increase pathogenesis and survival is host glycosylation, which regulates cell signalling or gene transcription. Lgt1, Lgt2 and Lgt3 are the three glycosyltransferases of Legionella known to inhibit eukaryotic protein translation by targeting eukaryotic translation elongation factor 1A (eEF1A), the most abundant protein synthesis factor, at Ser53 [93]. Other post-translational modifications of host proteins used by L. pneumophila include AMPylation (adenylylation) and de-AMPylation, as well as phosphocholination and dephosphocholination. The DrrA/SidM effector of L. pneumophila controls adenylylation of Rab1b, the host regulatory protein recruited to the Legionella-containing vacuole during infection. De-AMPylation is attributed to the SidD protein, which removes the AMP moiety from the modified Rab1. In the Rab1 and Rab35 GTPases, the L. pneumophila protein AnkX transfers a phosphocholine group from CDP-choline to a serine, whereas the Legionella Lem3 (lpg0696) effector, which has the opposite function to AnkX, may remove the phosphocholine moiety from Rab1 [46, 129].

Conclusion

Bacterial infection is increasingly being shown to play a role in altering the epigenetic information of host cells in a variety of ways. Owing to technological advancements, it is now possible to map DNA methylation and histone modification profiles across the whole human genome. Studying bacterial modulation of epigenetic processes can aid in deciphering the fundamental principles governing their occurrence and regulation. Opportunities for therapeutic applications will emerge as a greater understanding of the links between bacterial infection and the epigenome develops, especially if epigenetic modifications can be reversed. Rapid elimination of microbe-induced pathoepigenetic alterations may help to prevent persistent or latent infections, cancer and autoimmune disorders. New possibilities have emerged in the fields of bacterial pathogenesis and chromatin-based regulation of defence genes. Some intracellular pathogens are able to escape the host's defence mechanisms by inducing epigenetic changes. These changes result in modifications of chromatin structure and of the transcriptional levels of genes involved in the pathogenesis of many bacterial illnesses. Pathogens influence the host cell in this way to ensure their own survival. Understanding the epigenetic repercussions of bacterial infection could lead to the development of new vaccines and therapeutic applications.