3.1 Introduction

Type I TA loci consist of two genes: one encoding a small, hydrophobic potentially toxic protein (the toxin) often 60 amino acids or less; and a second encoding a small RNA (the antitoxin) encoded on the opposite strand of DNA to the protein gene. These sRNA antitoxins act by base pairing to complementary sequences within the toxin mRNA. This leads to the formation of a double-stranded RNA, which can repress translation of the toxin mRNA and/or destabilize the toxin mRNA, leading to decreased levels of toxic protein. Type I loci, including hok/sok of R1 and fst of pAD1, were initially described on plasmids where they serve a role in plasmid maintenance (see Chap. 2). For clarity, I will refer to these as plasmid-based loci. Homologs of hok/sok and fst were later found encoded within bacterial chromosomes (Faridani et al. 2006; Pedersen and Gerdes 1999; Weaver et al. 2009).

Excitingly, in 2002, a new type I locus was described. This locus, ldr-rdl, had no discernable homology to any plasmid sequence and was found encoded within the Escherichia coli K-12 chromosome and related bacteria (Kawano et al. 2002). These chromosomal loci, with no known plasmid-encoded homologs, are herein referred to as the novel type I loci. Additional novel type I loci have since been discovered either by experimental approaches to identify sRNAs, experimental serendipity, or by bioinformatics approaches.

3.2 Features of Novel Type I Loci

The novel loci have many interesting features, some shared with the plasmid-based loci, and others unique to those encoded on chromosomes. The main feature of all type I loci is that the toxic protein is small (60 amino acids or less) and hydrophobic. Essentially, the proteins appear to be no more than a transmembrane domain, with either a short N-terminal or C-terminal tail. For those with C-terminal tails, the C-terminus is typically rich in polar or aromatic residues (Fozo et al. 2010).
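The two defining features named above (small size and hydrophobicity) can be turned into a crude first-pass filter for candidate toxins. The sketch below is illustrative only and is not the search used in the cited studies: it scores hydrophobicity as the mean Kyte-Doolittle hydropathy (GRAVY), and both the function names and the cutoff of zero are assumptions chosen for the example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Mean Kyte-Doolittle hydropathy of a protein sequence."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


def is_candidate_toxin(protein: str, max_len: int = 60,
                       min_gravy: float = 0.0) -> bool:
    """Hypothetical filter: short (<= max_len residues) and net hydrophobic.

    The length limit follows the text; the GRAVY cutoff is an assumption.
    """
    return len(protein) <= max_len and gravy(protein) > min_gravy
```

A filter of this kind would pass many unrelated small membrane proteins, so any hit would still need the genetic context (an antisense sRNA) and a rescue experiment to qualify as a type I toxin.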

The novel loci are encoded distant from their flanking genes, often by as many as 200 nucleotides (nt) or more. This observation at first suggests that these are mobile genetic elements that can be acquired by horizontal gene transfer. However, for those examined, there is no significant difference between the G + C content of the locus and the surrounding chromosomal content. Also, there is no indication of insertion or repetitive elements reminiscent of horizontal gene transfer.
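The G + C comparison described above amounts to computing the base composition of the locus and of its flanking sequence and comparing the two. A minimal sketch follows; the function names are hypothetical, and deciding whether a difference is significant (for example against the genome-wide distribution) is left out.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_difference(locus: str, flank: str) -> float:
    """G + C fraction of the locus minus that of the flanking region."""
    return gc_fraction(locus) - gc_fraction(flank)
```

A value near zero is what the text reports for the loci examined; a large difference would be one hint of recent horizontal acquisition.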

The novel toxins possess rather long 5′ or 3′ untranslated regions (UTRs). The UTR in many cases is actually longer than the coding sequence of the toxin gene. Usually, the RNA antitoxins bind to these long UTRs. The UTRs are folded into stable stem-loop structures that, in the cases of the 5′ UTRs, sequester the ribosome binding site and start codon and thereby inhibit translation. Thus, the UTRs can repress toxin translation, independently of the action of the antitoxin RNA.

Type I antitoxins function similarly to the well-studied chromosomally encoded sRNAs. However, the base pairing potential of an antitoxin for its target is much more extensive than that of a conventional sRNA. Typical sRNAs pair over 6–12 nt and often require the RNA chaperone Hfq to stabilize the interaction (Waters and Storz 2009). The antitoxins have a much greater region of complementarity to their targets, from 18 to 60 nt or more. This extensive complementarity is likely why the known antitoxins do not rely on the bacterial protein Hfq to facilitate their interactions with their targets.

Another interesting feature of some of these chromosomal systems (like Ldr-Rdl, Ibs-Sib, and Zor-Orz) is that they are tandemly duplicated within the same intergenic region. Strains may also contain multiple copies of the same locus scattered about the genome. Why these loci are duplicated is not known; in some cases, the duplicated copies are practically identical in sequence, whereas for others, the loci are rather divergent in sequence. Whether these divergent homologs have unique cellular functions remains an unexplored area of research.

The organization of the novel type I loci can be varied. For example, some antitoxins are encoded antisense to the long 5′ or 3′ UTR or directly antisense to the coding region of the toxin mRNA (Table 3.1). In a few cases, the antitoxin is encoded divergently to the toxin. Such an arrangement has not been reported for plasmid-based type I toxins. In cases where the sRNA is not encoded directly antisense to the toxin mRNA, one may ask, “how can there be pairing and regulation?” In these cases, the antitoxin RNA has 18–21 nt of perfect complementarity to the toxin mRNA, allowing for regulation. A more detailed discussion of these unusual loci will follow below.
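The length of perfect complementarity between an antitoxin and its target can be estimated directly from the two sequences. The sketch below is a simplification under stated assumptions: it considers only Watson-Crick pairs (no G-U wobble, bulges, or secondary structure) and reports the longest uninterrupted antiparallel duplex. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def longest_perfect_pairing(mrna: str, srna: str) -> int:
    """Length of the longest uninterrupted Watson-Crick duplex.

    An antiparallel duplex between the two RNAs corresponds to a substring
    shared by the mRNA and the reverse complement of the sRNA, so this is a
    longest-common-substring search by dynamic programming.
    """
    target = mrna.upper()
    probe = reverse_complement(srna)
    best = 0
    prev = [0] * (len(probe) + 1)
    for i in range(1, len(target) + 1):
        curr = [0] * (len(probe) + 1)
        for j in range(1, len(probe) + 1):
            if target[i - 1] == probe[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best
```

For a divergently encoded pair, a result in the range of 18 to 21 would match the figure quoted in the text, whereas a conventional antisense pair would give a much longer stretch.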

Table 3.1 Novel Type I Loci

3.3 Classification of Novel Loci Based on Gene Arrangement

Based upon the genetic arrangement of the novel loci, herein I will classify the novel loci into two main categories: conventional novel loci and the unconventional novel loci. The conventional loci are arranged such that the antitoxin is encoded opposite to the 5′ or 3′ UTR or the coding region of the toxin (Table 3.1). This is the most common organizational structure and is also seen with the plasmid-encoded loci. These toxin families also tend to be widespread in nature. The unconventional loci were discovered recently. Here, the antitoxin is encoded divergent from the toxin, but they do share some extensive sequence complementarity. Fewer of these loci have been identified and their taxonomic distribution is more limited.

3.4 Conventional Loci

3.4.1 Ldr-Rdl Family: The First Family to be Described

The Ldr-Rdl family was the first novel type I locus discovered (Kawano et al. 2002). Originally, three Long Direct Repeat (LDR) sequences were noted within the E. coli strain MG1655. These repeats, LDR-A, LDR-B, and LDR-C, are approximately 530 nt in length, and are found in tandem to each other on the E. coli chromosome. In their analysis, Kawano et al. detected a fourth repeat (LDR-D) sequence, encoded distal from the previous three (Kawano et al. 2002). They noted two distinct transcripts encoded opposite of each other within the LDR-D repeat. One of these transcripts contained an open reading frame, named ldrD, that encoded a highly hydrophobic, 35 amino acid peptide. The ldrD mRNA was shown to have a long 5′ UTR (180 nt), and encoded antisense to the UTR is rdlD. RdlD, the second transcript, was shown to be an sRNA. The same genetic arrangement was noted in the other three LDR repeats. Given the similarities to the plasmid-encoded type I loci, the authors hypothesized that this was a toxin–antitoxin locus, and designed a “rescue” experiment to test this (Fig. 3.1). Overexpression of LdrD was toxic to E. coli, but co-expression of the sRNA RdlD prevented this toxicity (Kawano et al. 2002). A final regulatory feature noted was that RdlD had a much shorter half-life than the ldrD mRNA, similar to what is reported for the hok toxin-encoding mRNA and Sok antisense sRNA, where hok mRNA has a much longer half-life than Sok-RNA (Gerdes et al. 1990). It is probable that the instability of RdlD is important for allowing translation of the ldrD mRNA during specific growth conditions.

Fig. 3.1

Typical rescue experiment to confirm the presence of an antitoxin gene. The toxin gene is cloned behind an inducible promoter on a plasmid. The antitoxin is cloned behind a different inducible promoter on a compatible plasmid. Induction of the toxin gene alone results in inhibition of cell growth due to cell stasis or death. Co-expression of an antitoxin prevents the inhibition

RdlD can potentially base pair to the 5′ UTR of ldrD, approximately 45–100 nt upstream of the start codon of ldrD. How then could RdlD prevent expression of ldrD? Scrutiny of the DNA sequence revealed a potential small open reading frame, denoted ldrX, that is located entirely within the pairing region of RdlD (Gerdes and Wagner 2007). The authors hypothesized that translation of ldrX is necessary for translation of ldrD. This situation appears to be analogous to the hok mRNA; here the 5′-end of the small open reading frame mok, which is required for translation of hok, overlaps with the Sok pairing region (Thisted and Gerdes 1992).

There have been attempts to understand how overproduction of LdrD leads to cell death. Quite soon after overexpression of LdrD, DNA condensation is observed; this feature is not shared by other toxins, and indicates that LdrD may have unique targets in the cell (Kawano et al. 2002; Fontaine, unpublished observation). Although highly hydrophobic, overproduction of LdrD caused significantly slower dissipation of proton motive force when compared to other type I toxins (Fontaine and Fozo, unpublished observation). Combined, this suggests that the membrane is not the primary target of LdrD.

The biological function of the Ldr family is still unknown. Deletion approaches have not yielded strong phenotypes; however, only a limited number of growth conditions have been tested (Hobbs et al. 2010; Kawano et al. 2002). Rather little is known about transcriptional control of this family; further work in this area could give clues to function as well.

The Ldr toxin family is much broader than initially anticipated. Using a bioinformatics approach to identify all possible Ldr homologs, it was discovered that the Ldr and the Fst family (Fst was originally described as a plasmid-based toxin, see Chap. 2) are actually part of the same “superfamily” (Fozo et al. 2010). This finding was especially interesting when considering the effects of LdrD and Fst-induced toxicity: both cause DNA condensation when overproduced. Again, this feature is not found in other type I loci, suggesting unique biochemical properties of these proteins.

3.4.2 TxpA-RatA: Gram Positives Get into the Act

The first novel chromosomal locus found in a Gram-positive bacterium was TxpA-RatA, discovered in an sRNA screen in Bacillus subtilis. An RNA encoded by the intergenic region of ybqM (txpA)-ybqN was detected by microarray analysis. Further work confirmed that this sRNA (RatA) is encoded convergently to txpA and that the 3′ ends of the antisense RNA and the mRNA overlap by approximately 75 nt (Silvaggi et al. 2005; Table 3.1). This is very similar to the fst locus of plasmid pAD1 of Enterococcus faecalis (Weaver et al. 1996). Given the similar arrangement to fst, and that txpA encodes a 59 amino acid hydrophobic protein, the authors examined whether txpA-ratA constitutes a bona fide type I toxin–antitoxin locus (Silvaggi et al. 2005). Overproduction of TxpA caused cell death in B. subtilis, whereas co-expression of RatA prevented cell death. Furthermore, a strain deleted for ratA has increased levels of txpA mRNA compared to the parental strain. This suggests that binding of RatA to the toxin mRNA leads to degradation of the toxin-encoding mRNA; hence, the increase in txpA levels in a ratA deletion strain.

Functional analysis has been accomplished with a ratA deletion strain (Silvaggi et al. 2005). The authors noted lysis of ∆ratA on agar plates after several days of growth. This phenotype could be complemented by providing RatA in trans. Additionally, suppressor colonies arose that did not lyse. Sequencing of these strains revealed mutations in txpA, including premature stop codons and missense mutations. These results confirm the regulatory role of RatA and demonstrate the inherent toxicity of txpA overexpression.

How TxpA overproduction causes cell death is not clear. It is highly hydrophobic, and may potentially integrate into the membrane. This, along with the observed lysis phenotype, suggests that TxpA forms pores in the cell membrane. Surprisingly, two separate studies confirmed that TxpA overproduction in E. coli has no impact on cell growth, suggesting that the target of TxpA is missing in E. coli, or that TxpA does not reach toxic levels in E. coli (Fozo et al. 2010; Silvaggi et al. 2005).

The number of TxpA homologs across a wide array of species has grown considerably through novel bioinformatic analyses. In a search to discover new toxin–antitoxin loci, the gene ef3263 from E. faecalis V583 was identified as encoding a putative toxic protein (Fozo et al. 2010). This protein was then used as “bait” in a modified PSI-BLAST search. In this approach, the top hits of the search were then used as “baits” to pull out all possible homologs to EF3263. This repetitive search continued until no further homologs were obtained. Through this approach, it was shown that EF3263 is a distant homolog of TxpA. The searches revealed that TxpA is rather well represented across bacterial species and in fact there are six family members within the chromosome of E. faecalis V583 alone (Table 3.1). Of note, the sequence of the family members is incredibly diverse, even within the same strain (Table 3.2).
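The repetitive bait search described above is, in outline, a closure computation: each new hit becomes a bait in turn until a round yields nothing new. The sketch below shows only that control flow. The `search` argument is a hypothetical stand-in for one PSI-BLAST run (no real BLAST interface is called here), and details such as E-value cutoffs or restricting each round to the top hits are omitted.

```python
from collections import deque
from typing import Callable, Iterable, Set


def iterative_homolog_search(
    seed: str, search: Callable[[str], Iterable[str]]
) -> Set[str]:
    """Collect every protein reachable by repeatedly using hits as baits.

    `search(bait)` is assumed to return the identifiers of the hits found
    with that bait. The loop stops when no bait yields a new identifier.
    """
    found = {seed}
    queue = deque([seed])
    while queue:
        bait = queue.popleft()
        for hit in search(bait):
            if hit not in found:
                found.add(hit)
                queue.append(hit)
    return found
```

Because distant relatives are reached only through intermediate hits, a procedure of this shape can connect two proteins (here EF3263 and TxpA) that a single search would not link directly.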

Table 3.2 Chromosomal TxpA homologs of E. faecalis a

It is important to understand the biological function of TxpA. The locus is encoded within skin, a large phage-like element within the B. subtilis chromosome (Silvaggi et al. 2005). During sporulation, the skin element is excised from the mother cell chromosome, but remains present in the forespore, and will be present in future germinating cells. One hypothesis is that txpA-ratA functions similarly to plasmid addiction loci; it kills off the mother cell, while allowing the forespore to survive. Studies examining whether or not a txpA-ratA locus deletion strain is deficient in sporulation and/or fails to cause the death of the mother cell are needed to test this hypothesis.

What about the function of TxpA in species beyond B. subtilis? The sequences of TxpA homologs are quite diverse, varying greatly in length and amino acid content. Additionally, species like E. faecalis do not form spores, nor do they have a skin element within their chromosome. The sequences of the six TxpA members of E. faecalis are very different from one another (Table 3.1). Overproduction of these proteins in E. coli and E. faecalis has shown that only one is toxic to either of these bacteria (Fozo et al. 2010; Miracle and Fozo, unpublished observations). Given the lack of toxicity and the diversity in sequences, it is possible that these homologs have evolved separate functions. Thus, although txpA in B. subtilis may have a role in spore formation, the role(s) in other species is a mystery.

3.4.3 Ibs-Sib: The Smallest Toxins

Serendipity has also played a hand in the discovery of novel type I loci. Upon completion of the E. coli K-12 genome, a series of four repeat sequences was noted. The sequence repeat was referred to as the QUAD repeat, and analysis showed that within each repeat were strong promoter elements, but no open reading frame following the promoter sequence was observed (Rudd 1999). Thus, the QUADs were hypothesized to encode stable RNAs, and several papers did report detection of the putative RNAs (Argaman et al. 2001; Hershberg et al. 2003; Rivas et al. 2001; Wassarman et al. 2001). Following revision of the genomic sequence, a fifth repeat was found, and the name was changed to the SIB (short intergenic abundant sequences) repeat (Fozo et al. 2008b). The repeats are found in E. coli strains and related species, and are often repeated multiple times in the genome, sometimes in tandem to each other.

Overexpression of the Sibs gave an unexpected phenotype, and sequence gazing revealed that, encoded on the opposite strand relative to each sib was a small open reading frame (Table 3.1). These reading frames encode 18–19 amino acid proteins that were highly toxic to E. coli when overexpressed. However, expression of the Sib RNAs could prevent this toxicity (Fozo et al. 2008b). These small proteins were consequently named Ibs (inhibition brings stasis).

The Sib antitoxins, unlike the other type I antitoxins described to date, completely overlap the coding sequence of the ibs mRNA (Table 3.1). Interestingly, two forms of the Sib RNAs can be detected: a full-length form of approximately 140 nt, and a shorter form of approximately 110 nt, due to differences in the 3′ end of the RNA. The shorter form mapped such that it ends in the predicted ribosome binding site for the cognate toxin. Processing of the ibs-Sib RNA complex likely leads to the shorter Sib form. An rnc deletion strain, which lacks RNase III, does have higher levels of the full-length Sib RNAs than a wild-type strain, suggesting that ibs mRNAs and Sib antisense RNAs form duplexes that are cleaved by RNase III (Fozo, unpublished results).

Given that the Sib RNAs are very similar to each other, there could be cross talk in their regulation of the Ibs toxins. Initial experiments showed that this was not the case. For example, SibC could prevent IbsC-mediated toxicity, but not IbsE-mediated toxicity (Fozo et al. 2008b). An additional study mapped the region of specificity to the most variable region of Sib sequence (Han et al. 2010). The authors showed that two regions of the Sib RNA make contact with the ibs target initially, similar to what was reported for the interaction between RNAII and RNAI encoded by fst (Greenfield et al. 2000; Greenfield and Weaver 2000). These two initial contact domains (TRD1 and TRD2) were critical for the specificity observed for a Sib sRNA (Han et al. 2010).

Overproduction of the Ibs toxins leads to a rapid depolarization of the cell membrane and increased expression of the psp operon (Fozo et al. 2008). The psp operon is induced in response to stresses that impact membrane integrity and/or proton motive force. This induction correlates well with the observed membrane damage induced by the Ibs proteins.

The Ibs proteins are incredibly small and hydrophobic, yet rather toxic upon overproduction. A mutagenesis study was established to determine what residues were critical for Ibs toxicity (Mok et al. 2010). Incredibly, multiple single amino acid substitutions could be made, and toxicity was still maintained. Residues within the putative transmembrane domain, however, were critical for toxicity. These residues may be important for membrane localization or could play a role in protein–protein interactions.

What is the biological function for the Ibs? This question still remains unanswered. However, some clues were obtained using a strain in which the promoter of sibC was mutated, while leaving ibsC intact, and leading to increased ibsC mRNA levels. Although no major growth effects were observed, there was elevated expression of the psp operon, suggesting that the cell membrane integrity was compromised (Fozo et al. 2008b). Indeed, a greater portion of the mutant strain (7.5 % of the population) had higher levels of proton motive force dissipation as compared to the wild-type strain (1 % of the population; Fozo and Fontaine, unpublished observations). Although membrane damage was seen only in a subset of the population, it does suggest that elevated levels of IbsC can have detrimental effects, and that IbsC expression could be uneven within a population of cells. Further experiments examining sensitivity of the strain to physiological conditions known to induce the psp response (ethanol, heat, etc.) did not show any significant effects/differences compared to a wild-type strain.

One major difficulty in elucidating the true function of this locus is the inability to detect the ibs mRNA and protein in E. coli. Detection of ibs transcripts via northern analysis was possible only upon deletion of the cognate sib promoter (Fozo et al. 2008b). Mapping of the transcription start site of ibs was possible only by the use of multicopy plasmids (Han et al. 2010). The promoter identified is not very strong, and there was no evidence for binding sites of known transcriptional regulators.

3.4.4 BsrG-SR4: Limited to SPβ Prophage

Screens to identify sRNAs, along with sequence analysis, led to the recent discovery of another novel type I system in B. subtilis, the BsrG-SR4 locus (Jahn et al. 2012). Reports had indicated that there was an sRNA (SR4) encoded by the intergenic region of bsrG-yokL of B. subtilis (Irnov et al. 2010; Saito et al. 2009). Jahn et al. demonstrated that the two RNAs (bsrG and SR4) converge at their 3′ ends, and that they overlap by approximately 120 nt (Jahn et al. 2012). Similar to other type I TA loci encoding 3′ overlapping toxin mRNAs and antitoxin RNAs, RNA pairing appears to induce degradation of the target RNA. The bsrG gene encodes a small protein of 38 amino acids. Overproduction of the small protein was highly toxic to Bacillus, but co-expression of the SR4 RNA could alleviate this toxicity (Jahn et al. 2012).

Interestingly, the level of the toxin-encoding mRNA drops dramatically in response to heat shock. Careful analysis revealed that this is not owing to repression of transcription, but rather to RNA instability that is independent of SR4 (Jahn et al. 2012). How this phenomenon is related to function is not clear.

Functional studies of BsrG produced results similar to what was seen with TxpA. A strain deleted for SR4 had much higher levels of bsrG mRNA. The strain also lysed on agar plates, as was seen with TxpA (Jahn et al. 2012; Silvaggi et al. 2005). This lysis occurred more rapidly than in the case of TxpA, suggesting that either the total levels of BsrG were higher than those of TxpA or that BsrG itself is more toxic. Suppressors did arise easily in this strain, and many were mapped to mutations in the coding sequence itself, including frame shifts and premature stop codons.

What is the biological function for BsrG? The toxin is encoded by the SPβ prophage element of B. subtilis. Homologs are found only within the prophage, which is present in very few species. Given its location in the genome, the locus may serve to maintain the prophage within the population; however, further experiments are needed.

3.4.5 The YhzE Family: A Nontoxic Toxin?

YhzE, a 28 amino acid hydrophobic protein from B. subtilis ssp. subtilis str. 168, was identified in a computational search as a putative type I toxin (Fozo et al. 2010). Although yhzE was annotated, there was a clear, duplicated unannotated homolog encoded in tandem. To distinguish between these genes, the previously annotated gene is now referred to as yhzE-1 and the newly identified gene as yhzE-2 (Fozo et al. 2010). Many homologs were found across various species of Firmicutes, and B. subtilis 168 has eight separate members of this family. A highly expressed sRNA encoded convergently to the 3′ end of yhzE-1 was detected by northern analysis, suggesting that expression of yhzE-1 may be regulated by this RNA. However, overexpression of YhzE-1 in E. coli did not impact growth, similar to what was reported for TxpA and BsrG, nor was toxicity observed in B. subtilis (Jahn et al. 2012; Fozo et al. 2010; Silvaggi et al. 2005).

3.4.6 YonT: SPβ Prophage Déjà Vu

YonT was identified in the same bioinformatic search as YhzE (Fozo et al. 2010). Like bsrG, yonT is encoded by the SPβ prophage. Northern analysis detected an approximately 120 nt sRNA convergently transcribed relative to the toxin-encoding gene (overlapping 3′ ends). Overproduction of YonT was toxic in E. coli, something rather unusual among the previously described type I toxins from Bacillus.

So what is the function of YonT? This locus has not yet been characterized phenotypically or biochemically. Given its location within SPβ, it may serve to maintain the prophage within the population and it is possible that the bsrG-sr4 and yonT loci both contribute to the maintenance of the prophage.

3.5 The Unconventional Loci

For unconventional type I loci, the toxins are encoded divergently from their antisense RNAs. Thus, the base pairing potential between these gene pairs is much more limited compared to conventional type I loci. These families are much more limited in their distribution and are found mainly in E. coli, Shigella, and Salmonella species.

3.5.1 TisB-IstR-1: SOS-Induced Toxin

The TisB-IstR-1 pair was discovered in a screen to identify novel small regulatory RNAs in E. coli K-12 (Vogel et al. 2004; Argaman et al. 2001). Initially, the authors detected an sRNA (IstR-1) divergent from the uncharacterized operon ysdAB, now known as tisAB (Vogel et al. 2004). The tis mRNA is composed of two overlapping genes, tisA and tisB, that encode short proteins. Overexpression of tisAB is very toxic to E. coli, and this toxicity was shown to be due to tisB, and not tisA (Vogel et al. 2004). In fact, tisA appears not to be translated, and it may function solely to ensure translation of tisB (Darfeuille et al. 2007). Co-expression of the sRNA IstR-1 can repress TisB toxicity.

Control of tisB expression has been thoroughly investigated. Accessibility of the tisB ribosome binding site (RBS) is rather limited owing to the secondary structure of the mRNA as well as binding of the antitoxin IstR-1 RNA. However, toe printing experiments revealed that the ribosome could bind far upstream of the tisB RBS, and this region mapped to the tisA RBS (Fig. 3.2; Darfeuille et al. 2007). Further characterization showed that the tisA RBS serves as a “stand-by” ribosome site; the ribosome is unable to bind at the “real” site due to obstruction of the site and instead binds an upstream “stand-by” site (Unoson and Wagner 2007). The ribosome is essentially in a holding position until the obstruction over the true RBS is relieved (in the case of tisB, the folded mRNA structure breathes), allowing the ribosome to slide and begin translation at the correct site.

Fig. 3.2

Model describing regulation of translation of tisB. Indicated are the “stand-by” ribosome binding site of tisA and the true ribosome binding site for tisB (see text and Darfeuille et al. 2007 for details)

So what is the function of TisB? When overproduced, TisB localizes to the inner membrane of E. coli, leading to membrane damage (Unoson and Wagner 2008). Consistent with this, overproduction of TisB induces transcription of genes that respond to membrane damage (Fozo et al. 2008b). TisB overproduction also leads to reduced replication, transcription, and translation rates, but these are probably indirect effects (Unoson and Wagner 2008). Taken together, membrane damage and the effects on macromolecule biosynthesis explain how TisB overproduction could cause cell death.

However, what happens when TisB is expressed at endogenous levels, and not from a multicopy plasmid? Transcription of tisB is induced by DNA damage, and thus studies have focused on whether TisB plays a role in the SOS response (Vogel et al. 2004). One study examined competition between a wild-type strain and one in which istR-1 was deleted. This mutant strain has elevated levels of tisB compared to the wild-type strain. In response to long-term growth and repeated exposure to the DNA damaging agent mitomycin C, the mutant strain was eventually outcompeted by the wild type (Unoson and Wagner 2008). A study published by a second group reported that overproduction of TisAB could lead to inhibition of an SOS response (Weel-Sneve et al. 2008). Together, these studies suggest that the tis-istR-1 locus is important for proper responses to DNA damage.

A final study has linked tisB expression to persistence (see Chap. 11). The majority of cells treated with the antibiotic ciprofloxacin, which causes DNA damage, are killed; however, a subpopulation known as persisters can survive this treatment. Deletion of tisB led to a dramatic reduction in the number of persister cells formed upon ciprofloxacin treatment. How TisB can lead to the formation of persister cells is not clear, but perhaps TisB expression causes slow cell growth, allowing them to survive, and recover from DNA damage.

Taken together, these studies indicate that induction of tisB in response to DNA damage is an important component of cell fitness. However, they also show that tisB levels must be tightly regulated, since its expression could be detrimental.

3.5.2 ShoB-OhsC: Another Case of Tight Translation Control

Similar to the TisB-IstR study, a cloning-based screen used to identify novel sRNAs in E. coli was instrumental in discovering yet another type I locus, shoB-ohsC (Kawano et al. 2005). These authors detected two divergent RNA transcripts, encoded by the intergenic region of yfhL-acpS, that share a 19 nt region of sequence complementarity (Fig. 3.3). One of the genes, named shoB, encodes a putative 26 amino acid hydrophobic peptide, while the second gene, ohsC, encodes an sRNA. Similarity searches show that this locus is limited to E. coli and Shigella species, and is thus even more limited in its distribution than the tisB-istR-1 locus (Fozo et al. 2008b, 2010; Kawano et al. 2005).

Fig. 3.3

Genetic organization of the divergently transcribed shoB-ohsC genes of E. coli K-12 MG1655. The shoB mRNA is shown as a gray arrow pointing leftward while the ohsC antisense gene is shown as a gray arrow pointing rightward. The shoB reading frame is indicated by the black arrow and the 19-nucleotide regions of complementarity between shoB mRNA and OhsC antisense RNA are indicated by the white boxes

Overproduction of ShoB was very toxic to E. coli, but this toxicity could be repressed by overexpression of OhsC-RNA. Additionally, ShoB overproduction led to a rapid decrease in proton motive force, suggesting that cell death was likely due to membrane damage (Fozo et al. 2008b).

The endogenous expression patterns of shoB mRNA and its antitoxin OhsC are interesting in that they are reciprocal; in cells grown in minimal medium, shoB mRNA is readily detected in exponential phase, but not stationary phase. In contrast, the OhsC-RNA is readily detected in cells grown to stationary phase (Kawano et al. 2005). Translational control of ShoB may be related to that of tisB (reviewed in Fozo et al. 2008a). As in the case of tisB, there are multiple 5′ ends for the shoB transcript. Analysis of translational reporter fusions showed that the full-length shoB mRNA is not translated, whereas a 5′-truncated mRNA was, albeit at a low level. Translation of shoB was increased in an ohsC deletion strain, confirming that OhsC functions to repress shoB (Fozo et al. 2008b). But even without OhsC present, translation of the toxin was still quite low. Thus, overall translation of ShoB is highly repressed, and this is independent of the antitoxin. One clue as to how this repression occurs comes from the predicted secondary structure of the shoB mRNA: the mRNA is tightly folded so that the ribosome binding site and start codon are sequestered by stable stem-loop structures (Fozo et al. 2008a). The biological function of ShoB is not yet known.

3.5.3 Zor-Orz: The Newest Members

In a search for new type I loci, two hydrophobic, duplicated annotated proteins within the E. coli O157:H7 EDL933 genome, z3289 and z3290, were identified (Fozo et al. 2010). These proteins are 29 amino acids in length, are encoded in tandem, and differ by a single amino acid. Homologs are found within pathogenic and commensal E. coli strains, and in Shigella species. In some cases, only one of the genes is present. However, neither gene is present in laboratory strains such as E. coli K-12 MG1655. Northern analyses confirmed the presence of two sRNAs, encoded antisense to and divergent from the annotated proteins (Table 3.1).

Overproduction of either protein was toxic to MG1655. To determine whether the putative antitoxins OrzO (originally denoted as sRNA-1, divergent to z3289) and OrzP (originally denoted as sRNA-2, divergent to z3290), are indeed antitoxins, a rescue experiment was performed (Wen and Fozo, unpublished data). The sRNAs could indeed repress toxicity associated with overproduction of zorO (z3289) and zorP (z3290) (Fig. 3.4).

Fig. 3.4
Zors are type I toxins with unusual arrangements. a Repression of ZorO-induced toxicity. One plasmid contained zorO cloned behind the arabinose-inducible PBAD promoter and the other plasmid contained either orzO (in red) or orzP (in blue) cloned behind an IPTG-inducible Plac promoter in E. coli MG1655. Addition of 0.2 % arabinose or 1 mM IPTG is indicated by the arrows. The closed symbols indicate no IPTG added; the open symbols indicate the addition of IPTG. b Schematic of the zor-orz locus of E. coli O157:H7 EDL933. The toxin open reading frames are indicated by the black boxes; the arrow indicates the start of toxin transcription. The genes encoding the antitoxins are in gray. The –10 and –35 indicate the positioning of the predicted sigma-70 promoter elements. Note that the –35 elements for zorO and orzO overlap, as do the –35 elements for zorP and orzP.

As with the multiple ibs-sib loci, the two zor-orz loci are highly similar. This similarity raises the question: do the antitoxins cross-regulate expression of the noncognate toxin? Preliminary experiments indicate that this is not the case; only OrzO can repress zorO expression (Fig. 3.4). Mutagenesis experiments have narrowed the region required for regulatory specificity, and work is in progress to unravel the exact basis of this specificity.

Remarkably, the putative –35 promoter element of each antitoxin gene overlaps the –35 promoter element of its cognate toxin gene (Fig. 3.4). One could imagine competition for RNA polymerase binding between the divergent but adjacent promoters (Fozo et al. 2010).

3.6 The Big Question: What are Type I TA Loci Doing in Bacteria?

Elucidating the biological role of these genes has been challenging. Unlike the toxins of type II loci, there is no similarity to known enzymatic domains or well-characterized proteins to give guidance for biochemical studies. Type I toxins essentially resemble a transmembrane domain. The speculation has long been that they simply form pores in the membranes of cells.

If they simply form pores in membranes, then why are there so many families, and so many duplicated families within one genome? For example, E. coli O157:H7 EDL933 has 6 Ibs-Sib, 2 Zor-Orz, 4 Ldr-Rdl, 1 Tis-IstR, and 1 ShoB-OhsC. The limited data generated so far show that the different toxin families are expressed under different conditions in the cell. To elucidate their function, a variety of approaches must be used. There has been recent success in using large-scale approaches to identify phenotypes of small RNAs and small proteins; similar approaches may be useful for phenotypic characterization of the type I loci. More in-depth gene expression studies may also prove useful; they have been extremely beneficial for elucidating potential functions for TisB. Biochemical approaches must be considered as well. Traditionally, work with small proteins has proven challenging, but there has been some recent success using epitope-tagged proteins to identify both localization and putative interacting partners (Hemm et al. 2008; Ramamurthi et al. 2006, 2009).

3.7 Finding More

Novel chromosome-encoded type I toxins have been discovered via identification of sRNAs, by serendipity, or through bioinformatic approaches. Why were they missed in the first place? An open reading frame is typically not annotated unless it is larger than 50 codons, thus leaving many small genes unnoticed. Additionally, the type I toxins identified to date have been found only in γ Proteobacteria and Firmicutes. Does this mean that type I TA loci are limited to these groups? Likely not: given their small sizes and divergent sequences, bioinformatic searches are somewhat limited in predicting homologs. Relatively few changes in sequence can cause such search algorithms to fail, thus limiting the number of putative homologs that are identified. For example, both BLAST and PSI-BLAST using the Ibs proteins as bait failed to identify the Ibs homologs in Helicobacter pylori (Fozo et al. 2010; Sharma et al. 2010). Additionally, traditional BLAST searches were unable to identify ef3263 as a member of the txpA family of B. subtilis.
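The effect of a length cutoff on annotation can be illustrated with a minimal sketch. The Python function below scans one strand for ATG-initiated open reading frames and discards any that fall below a chosen number of codons. The function name, the `min_codons` parameter, and the toy sequence are illustrative assumptions and do not correspond to any annotation pipeline discussed here; the sequence is synthetic and is not a real toxin gene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=1):
    """Return (start, end, n_codons) for ATG-initiated ORFs on the given strand.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    n_codons counts the start codon but not the stop codon.
    Only the first ATG upstream of each stop is reported in each frame.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3    # codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None                   # close it and keep scanning
    return orfs

# Synthetic example (assumption): a 29-codon ORF, ATG followed by 28 Leu codons.
toy = "CC" + "ATG" + "CTG" * 28 + "TAA" + "GG"

print(find_orfs(toy, min_codons=50))  # cutoff of 50 codons: the ORF is dropped
print(find_orfs(toy, min_codons=10))  # relaxed cutoff: the ORF is reported
```

With the cutoff set to 50 codons the 29-codon frame is silently discarded, whereas a relaxed cutoff recovers it. The sketch ignores the reverse strand, alternative start codons, and ribosome binding sites, all of which a real search would need to consider.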

Identification of the unconventional chromosomal loci is difficult using bioinformatic tools. Designing algorithms to find similar loci is not trivial, because the identification of limited stretches of similarity is challenging. The ShoB-OhsC locus was identified in a search for sRNAs, and only then was it noted that there was sequence complementarity between the divergent genes. For the zor-orz pairs, the proteins were identified as having properties common to type I toxins, but the Orz sRNAs were identified only by applying RNA folding algorithms to the region.
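As a rough illustration of what screening for "properties common to type I toxins" might look like, the sketch below flags peptides that are both short (60 amino acids or less) and contain a strongly hydrophobic stretch. It uses the Kyte-Doolittle hydropathy scale with a 19-residue window and a cutoff of 1.6, a commonly cited rule of thumb for membrane-spanning segments. This is an assumed, simplified filter and not the procedure used in the studies cited above; the function names, thresholds, and example peptides are all hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def max_window_hydropathy(protein, window=19):
    """Highest mean hydropathy over any window of the given length.

    Peptides no longer than the window are averaged as a whole.
    """
    scores = [KD[aa] for aa in protein.upper()]
    if not scores:
        raise ValueError("empty sequence")
    if len(scores) <= window:
        return sum(scores) / len(scores)
    return max(
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    )

def looks_like_type1_toxin(protein, max_len=60, window=19, cutoff=1.6):
    """Crude screen: short peptide with at least one hydrophobic stretch."""
    return (len(protein) <= max_len
            and max_window_hydropathy(protein, window) >= cutoff)

# Made-up peptides for illustration only (not real toxin sequences).
hydrophobic_peptide = "M" + "LIVA" * 6 + "KR"   # 27 residues, hydrophobic core
polar_peptide = "MSDEKRNQSTDEKRNQSTGH"          # 20 residues, mostly polar
long_protein = "M" + "LIVA" * 20                # 81 residues, exceeds max_len

for name, seq in [("hydrophobic", hydrophobic_peptide),
                  ("polar", polar_peptide),
                  ("long", long_protein)]:
    print(name, len(seq), looks_like_type1_toxin(seq))
```

A filter of this kind would only nominate candidates. As the text notes, finding the cognate antitoxin still requires a separate step, such as detecting complementary sRNA transcripts or examining predicted RNA structure in the region.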

So, do type I loci exist in other species? It is my belief that new families in additional bacterial species do exist and eventually will be identified. For example, RNA sequencing data are now revealing many overlapping, small transcripts in a variety of bacteria, and some of these are predicted to be type I loci (Steglich et al. 2008). Our knowledge regarding the novel type I toxin-antitoxin loci is still in its infancy. However, given the recent advances in identification and functional characterization, it is likely that the functions of many known and yet-to-be-discovered loci will be revealed in the coming years.