1 Introduction

1.1 Global Importance of Enterotoxigenic Escherichia coli

Enterotoxigenic E. coli (ETEC) are among the leading causes of diarrheal illness worldwide. These organisms are particularly prevalent in developing countries where basic sanitation and clean water are often limited. In these settings, ETEC preferentially affect young children, many of whom continue to succumb to rapid dehydration resulting from severe diarrheal illness [1]. ETEC infections follow ingestion of contaminated food or water, and in recent years they have also emerged as large-scale food-borne outbreaks in industrialized countries including the USA, presumably due to importation of contaminated food.

1.2 Toxins Define the Enterotoxigenic E. coli Pathovar

The enterotoxigenic E. coli are a diverse collection of pathogens defined by the production of at least one of three diarrheagenic toxins: heat-labile toxin (LT) and the heat-stable toxins STh and STp [2]. Heat-labile toxin shares approximately 80 % molecular identity with cholera toxin, and both toxins activate production of the second messenger cAMP in target intestinal epithelial cells. Both heat-stable toxins, like the native human intestinal peptide guanylin, bind to the extracellular portion of guanylate cyclase C to stimulate the production of cGMP. Both cAMP [3] and cGMP [4–6] stimulate protein kinases that phosphorylate the cystic fibrosis transmembrane conductance regulator (CFTR) chloride channel, thereby enhancing export of chloride ions into the intestinal lumen. Concurrent inhibition of sodium-hydrogen ion exchange results in a net loss of NaCl and water into the intestinal lumen with ensuing watery diarrhea.

All three toxin genes are usually encoded on extrachromosomal plasmids and are frequently flanked by insertion sequences (IS), implicating mobile genetic elements in evolution of these genetically diverse pathogens. Indeed, it has been suggested that the diversity of the ETEC pathotype of diarrheagenic E. coli is driven largely by widespread dissemination of toxin gene encoding plasmids among a diverse collection of E. coli host strains [7].

1.3 The Challenge Posed by Pathogen Diversity

In essence, it appears that a diverse population of E. coli can potentially serve as effective hosts for production of these plasmid-encoded toxins. This is perhaps best exemplified by the diversity of serotypes that are represented among the ETEC pathovar [8]. While some specific O and H serotypes are more common, more than 75 O-antigen serogroups and more than 50 different H serogroups are represented in ETEC. A single study of 100 ETEC strains in Egypt identified 59 different O:H combinations [9]. Large-scale whole-genome sequencing of ETEC and other pathogenic E. coli, as well as some commensal strains, has provided some additional insight into the nature of this diversity [10, 11]. When considered in general, the E. coli pangenome, or collection of all genes present in those genomes sequenced to date, is quite large [12]. Remarkably, as each new genome sequence is analyzed, an estimated 300 unique genes will be added to this “open” pangenome [10, 12]. The appreciable underlying genetic plasticity of E. coli coupled with horizontal transfer of essential virulence genes on mobile elements may suggest that the ETEC pathovar will constantly evolve as toxin genes and other essential features are transferred into new host backgrounds.

1.4 Limits on ETEC Diversity Imposed by Key Virulence Requirements

Despite the underlying plasticity of E. coli genomes, there are two constraints imposed on ETEC genome content. First, all E. coli have at their core a collection of approximately 2200 genes that are mostly involved in essential metabolic functions [10, 12]. This core subset of genes, common to all E. coli, is at least theoretically devoid of viable vaccine targets because these genes are largely shared with commensal strains that are a component of the natural human gastrointestinal microbiota.

The requirement for other virulence traits in addition to the toxins themselves imposes another potential constraint on pathogen evolution, one that is more relevant to antigen discovery and vaccine target selection. While the genes encoding the known enterotoxins define the ETEC pathovar, genes that encode additional features required for effective delivery of toxin payloads to cognate receptors on the epithelial surface can essentially serve as a diversity checkpoint. ETEC must survive ingestion, navigate to the lumen of the small intestine, and ultimately interact directly with intestinal enterocytes [13], bringing the pathogen into close proximity to epithelial cell surface receptors for the delivery of LT and ST. The relatively small number of pathovar-specific features in addition to the known toxins suggests that there are only a limited number of ways in which this can be accomplished [10]. Within this context, it should be possible to discover relatively conserved virulence molecules that are either exclusive to ETEC or shared with other diarrheagenic pathovars, but not with commensal strains.

1.5 Immunologic and Structural Diversity of Major Vaccine Targets

The best-studied antigens of ETEC to date are the colonization factors (CFs), plasmid-encoded structures that have been a major focus of vaccine development. Perhaps the best published data in support of CFs as vaccine targets come from passive immunization studies [14, 15] demonstrating that orally administered anti-CF antibodies afford significant protection against experimental challenge with homologous strains of ETEC. A number of active vaccination studies with different ETEC vaccine constructs have yielded significant increases in anti-CF antibodies; however, the overall protection afforded by these vaccines has varied [16–18]. Nevertheless, emerging data suggest that protection can be improved by targeting individual CF fimbrial subunits, particularly tip adhesin structures [19, 20].

A variety of different fimbrial, fibrillar, long pilus [21] and small linear fiber [22] colonization factor structures have been described thus far. A formidable challenge to vaccine development based exclusively on colonization factors has been the significant antigenic heterogeneity exhibited by the CFs. Although more than 25 different antigenically distinct CFs have been described to date [23, 24], it has been estimated that approximately 40–50 % of ETEC strains do not make one of these established CFs [25, 26].

1.6 Identification of Novel Vaccine Antigens in ETEC

There is no single method for identification of putative vaccine antigens in ETEC, and the technology for antigen identification is evolving rapidly. The approach outlined here encompasses a variety of complementary methods that are currently being used to identify and validate candidate vaccine targets for ETEC. As shown in Fig. 1, these include classical genetic approaches, genomics and whole-genome sequencing, transcriptomics, proteomics, and immunoproteomics using protein microarray technology. Each of these methodologies has specific merits as well as disadvantages to consider in developing antigen discovery platforms.

Fig. 1 Methodologies employed to identify and characterize potential vaccine antigens for enterotoxigenic Escherichia coli

1.7 Classical Genetic Approaches

Transposon mutagenesis has been widely used in bacterial pathogenesis studies to identify virulence genes involved in pathogen-host interactions. TnphoA, originally introduced by Manoil and Beckwith [27] 30 years ago, has been extensively used for identification of surface antigens in Gram-negative pathogens, including the EatA autotransporter, which is secreted by ETEC [28]. Briefly, TnphoA incorporates a truncated version of the gene for alkaline phosphatase (lacking a signal peptide-encoding region) within the Tn5 transposable element. Transposon “hops” that result in gene fusions of phoA with genes encoding surface antigens can be detected by colony screening on agar containing an alkaline phosphatase indicator and the antibiotic to which the transposon confers resistance. Original versions of TnphoA included the transposase within the transposable element. This occasionally created problems in attempting to clone and sequence the insertions due to subsequent rearrangements or continued transposition. To avoid these problems and to establish a system where insertions could be easily cloned and sequenced, pTnphoA.ts was developed by placing the transposase gene on a temperature-sensitive suicide plasmid outside of the inverted repeats of the Tn5 transposable element [29] (Fig. 2). This temperature-sensitive “plasposon” has been used to identify a number of candidate antigens in ETEC including EtpA [29]. Because the temperature-sensitive origin of replication is contained within the transposed element, the region of insertion can easily be identified by restriction endonuclease digestion and re-ligation of the target DNA into a recombinant temperature-sensitive plasmid for isolation and subsequent sequencing, as outlined in the protocol below.

Fig. 2 TnphoA.ts mutagenesis strategy: (top) linear plasmid map of pTnphoA.ts. The transposon is shown between the two inverted repeat elements (ir, yellow). Within the transposon are a truncated version of the alkaline phosphatase gene (′phoA), a temperature-sensitive origin of replication (ts), a beta lactamase gene encoding ampicillin resistance (ßlac), and a flp recombinase recognition site (frt). Encoded on the plasmid backbone outside the repeat elements are the RP4 transfer locus (RP4 oriT) and the Tn5 transposase gene. Shown below the map are the major steps in transposon mutant generation, starting with transformation of the recipient strain at permissive temperature (30 °C). Selection for transposition events with productive fusions then takes place at higher temperature under antibiotic and XP (blue colony) selection. Resulting colonies can then be tested to identify mutants phenotypically different from the parental ETEC strain. Total genomic DNA (plasmid and chromosome) is then digested with a restriction enzyme that cuts outside of the transposon, re-ligated, and used to transform a laboratory cloning strain to ampicillin resistance at 30 °C. The resulting plasmid can be sequenced to identify the gene interrupted by transposition, and the corresponding protein can be identified in GenBank

2 Materials

2.1 TnphoA.ts Mutagenesis Materials

  1. DH10BT1 (pTnphoA.ts), available from the Fleckenstein laboratory.

  2. Target ETEC strain.

  3. Electroporation apparatus, cuvettes.

  4. Ampicillin.

  5. Luria agar base: tryptone (1 %), yeast extract (0.5 %), agar (1.5 %).

  6. Luria agar plates containing ampicillin at a final concentration of 100 μg/ml.

  7. 5-Bromo-4-chloro-3-indolyl-phosphate (p-toluidine salt) (XP) prepared as a 20 mg/ml stock solution in N,N-dimethylformamide.

  8. Gene fusion indicator plates: XP final concentration 40 μg/ml, ampicillin 100 μg/ml (protect from light).

  9. Sterile wooden toothpicks.

  10. Restriction endonuclease(s), buffer.

  11. Genomic DNA preparation kits (e.g., MasterPure DNA Purification Kit, Epicentre).
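The working concentrations listed above follow from the usual C1 × V1 = C2 × V2 dilution relationship. The short Python sketch below illustrates the arithmetic for the indicator plates. The 500 ml batch volume and the 100 mg/ml ampicillin stock are assumptions chosen only for illustration; only the XP stock (20 mg/ml) and the final concentrations come from the list above.

```python
def stock_volume_ul(stock_mg_per_ml: float, final_ug_per_ml: float,
                    final_volume_ml: float) -> float:
    """Return the stock volume (in microliters) needed to reach the
    requested final concentration, using C1 * V1 = C2 * V2."""
    if stock_mg_per_ml <= 0 or final_volume_ml <= 0:
        raise ValueError("stock concentration and volume must be positive")
    stock_ug_per_ml = stock_mg_per_ml * 1000.0  # mg/ml -> ug/ml
    volume_ml = final_ug_per_ml * final_volume_ml / stock_ug_per_ml
    return volume_ml * 1000.0  # ml -> ul


BATCH_ML = 500.0  # assumed batch of molten agar (not from the protocol)

# XP: 20 mg/ml stock, 40 ug/ml final (values from the materials list)
xp_ul = stock_volume_ul(20.0, 40.0, BATCH_ML)

# Ampicillin: 100 ug/ml final; the 100 mg/ml stock is an assumption
amp_ul = stock_volume_ul(100.0, 100.0, BATCH_ML)

print(f"XP stock: {xp_ul:.0f} ul; ampicillin stock: {amp_ul:.0f} ul")
```

With the listed XP stock, the final concentration corresponds to a 1:500 dilution of the stock into the agar.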

2.2 In Vitro Analysis Materials

  1. Gastrointestinal cell line(s) which support bacterial adhesion and/or toxin delivery assays (see Table 3) seeded into 96-well tissue culture plates.

  2. Luria broth (LB).

  3. Sterile culture tubes (Falcon 2059 or equivalent).

  4. Sterile Hanks’ Balanced Salt Solution (HBSS) (e.g., Life Technologies or equivalent containing Ca²⁺, Mg²⁺).

  5. Triton X-100 (0.1 %) sterile solution in PBS.

  6. Luria agar plates.

  7. Cyclic nucleotide assay kits.

    (a) cAMP (e.g., Arbor Assays K019).

    (b) cGMP (e.g., Arbor Assays K020).

2.3 In Vivo Intestinal Colonization and Vaccine Testing Materials

  1. Strain with an antibiotic resistance marker in a permissive location in the genome to facilitate counter selection (e.g., jf876, which contains a kanamycin resistance cassette in the lacZYA locus, a site that does not contribute to colonization) [13].

  2. Mice (adult 5–8-week-old females, e.g., CD-1, Charles River), n = 20–30 (at a minimum, ten mice will be needed for adjuvant-only controls and ten for adjuvant + vaccine antigen group comparisons).

  3. Purified antigen (amount will depend on the route of vaccination).

  4. Adjuvant appropriate for route of immunization.

  5. Autoclaved food and bedding material.

  6. Luria agar culture plates containing antibiotic (e.g., kanamycin, 25 μg/ml).

  7. Drinking water containing streptomycin (5 g/L).

  8. Famotidine (sterile) for I.P. injection of mice.

  9. Saponin 5 % sterile solution in PBS.

3 Methods

3.1 pTnphoA.ts Plasposon Transformation Steps

  1. Grow DH10BT1 (pTnphoA.ts) overnight at 30 °C in Luria broth containing ampicillin, 100 μg/ml.

  2. Isolate plasmid DNA.

  3. Transform target strain by electroporation.

  4. Select transformants overnight at 30 °C on plates containing ampicillin, 100 μg/ml.

  5. Pool multiple transformants into a single tube (e.g., Falcon 2059) containing 2 ml of Luria broth with ampicillin, 100 μg/ml.

  6. Grow overnight at 30 °C, and dilute 1:2 in glycerol freezing media.

  7. Transformant mix can be preserved at −80 °C for future use.

3.2 Generation of TnphoA.ts Mutants

  1. Grow transformant mixture or glycerol stock from above at 30 °C overnight in Luria broth containing ampicillin, 100 μg/ml.

  2. The following morning dilute 1:100 in fresh media (ampicillin, 100 μg/ml) and grow with shaking (250 rpm) for 90 min at 30 °C.

  3. Plate 100 μl of dilutions (~1:100) onto fresh gene fusion indicator plates and grow at 37 °C overnight.

  4. Isolate individual blue colonies from indicator plates with sterile toothpicks and streak-purify onto fresh indicator agar.

  5. Incubate at 37 °C overnight.

  6. Select isolated, blue colonies for growth overnight in 2 ml of Luria broth containing ampicillin, 100 μg/ml.

  7. Use 1 ml of overnight culture for isolation of genomic DNA (follow the manufacturer’s protocol).

  8. Preserve the remaining 1 ml of overnight culture as a frozen glycerol stock.

3.3 TnphoA.ts Mutant Screening

Because the Tn5-based transposition occurs largely at random resulting in the production of (at least) thousands of independent mutants, mutagenesis ideally can be coupled with a relatively high-throughput in vitro phenotypic assay. The results of these phenotypic screening assays can then be used to direct cloning and identification of insertion sites.

3.4 Identification of Transposon Insertion Sites

Because the transposition element contains a temperature-sensitive origin of replication in addition to the beta lactamase gene between the inverted repeats, recovery of DNA regions flanking the insertion is relatively straightforward.

  1. Isolate total genomic DNA (plasmid and chromosomal DNA). (Most commercial genomic DNA preparation kits, e.g., Wizard® Genomic DNA Purification, Promega, work well with ETEC strains.)

  2. Digest an aliquot of genomic DNA with a restriction endonuclease that does not cut between the inverted repeats of the transposon (for instance, MluI).

  3. Ligate the DNA with T4 DNA ligase.

  4. Transform the ligation mixture into a commercially available ampicillin-sensitive E. coli cloning strain (e.g., DH10BT1, DH5α, Top10), selecting on Luria agar containing ampicillin, 100 μg/ml, at 30 °C.

  5. Grow isolated ampicillin-resistant colonies in Luria broth overnight at 30 °C.

  6. Isolate plasmid DNA using a commercially available plasmid preparation kit that yields DNA of sufficient quality for sequencing.

  7. Set up separate sequencing reactions with the primers TnphoA.179 (5′-CCATCCCATCGCCAATCA-3′) and TnphoA.ts1 (5′-CGAAATTAATACGACTCA-3′).

  8. Resulting DNA sequence information can then be used in BLASTN or BLASTX program searches of NCBI databases (http://blast.ncbi.nlm.nih.gov/Blast.cgi) to identify potential homologues.
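Step 2 above depends on choosing an enzyme with no recognition site between the inverted repeats. As a minimal sketch, the Python below scans a sequence for a recognition site on either strand. The MluI recognition sequence (ACGCGT) is standard; the transposon sequence shown is a made-up placeholder, not the real TnphoA.ts sequence, and the function names are illustrative only.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def site_positions(seq: str, site: str) -> list[int]:
    """0-based start positions where the recognition site occurs on
    either strand of seq (reported in top-strand coordinates)."""
    seq = seq.upper()
    site = site.upper()
    targets = {site, reverse_complement(site)}
    width = len(site)
    return [i for i in range(len(seq) - width + 1)
            if seq[i:i + width] in targets]


def enzyme_is_usable(transposon_seq: str, site: str) -> bool:
    """True if the enzyme has no site inside the transposed element."""
    return not site_positions(transposon_seq, site)


MLUI_SITE = "ACGCGT"

# Placeholder sequence for illustration only; substitute the actual
# sequence between the inverted repeats of pTnphoA.ts.
toy_transposon = "GGATCCTTAGCAATGCCGTTAACGGA"

print("MluI usable:", enzyme_is_usable(toy_transposon, MLUI_SITE))
```

The same check can be repeated for any alternative enzyme by passing its recognition sequence; degenerate sites would need pattern matching that this sketch does not attempt.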

3.5 De Novo Identification of Vaccine Antigens from Whole-Genome Sequences

The cost and time required for sequencing and assembling entire bacterial genomes have declined dramatically since the genome of Haemophilus influenzae was first assembled more than 20 years ago [30]. This has permitted both the de novo identification of candidate antigens and a recent assessment of the conservation of known antigens and putative vaccine targets in multiple ETEC strains from different geographic regions isolated over time [31].

One approach to the identification of candidate vaccine antigens is the “in silico” interrogation of genome sequence data using algorithms or programs that select molecules which have at least a theoretical likelihood of being exposed on the surface of the organism or secreted. These exposed or surface-expressed molecules are at least in principle amenable to neutralization by vaccination. Complex multifactorial investigations involving multiple genomes require training and experience in bioinformatics; however, individual genomes can be interrogated using fairly simple Web-based interfaces.

In addition to identification of surface molecules, the approach to characterization of potential novel ETEC vaccine antigens involves an assessment of the degree to which these antigens are unique to ETEC genomes and to which they are shared with commensal strains. By definition, all ETEC strains make at least one of the known toxins (LT, STh, and/or STp). However, there is no other universally shared antigen that is common to all ETEC that is not also represented in the rest of E. coli including the nonpathogenic commensal strains that make up a small portion of the microbiome of most humans [32]. The challenge therefore lies in defining appropriate vaccine targets among the population of antigens that are uniquely pathovar/ETEC associated.

  1. General approaches to defining potential vaccine antigens from genome data.

    Traditional identification of candidate antigens has relied on empirical microbial pathogenesis studies or genetic techniques outlined above to define surface features that could be exploited in vaccines. However, with the advent of high-throughput whole-genome sequencing, it is now possible to use “reverse vaccinology” [33, 34] to identify candidate antigens by in silico interrogation of data from multiple ETEC genomes. To some extent, draft or complete genomes can be interrogated by those without specific training in informatics to identify putative vaccine antigens with publicly available Web-based platforms (Table 1). However, application of these and other algorithms on a broader scale, potentially involving hundreds of genomes, will certainly require more extensive bioinformatic capabilities. Both approaches follow the same general scheme outlined in Fig. 3.

    Table 1 Bioinformatic links applicable to ETEC reverse vaccinology

    Fig. 3 Algorithms useful in identification and in silico characterization of candidate vaccine antigens

  2. Identification of pathovar-specific features.

    The scheme for in silico identification of potential candidates essentially involves two main tasks. The first is to identify features of ETEC genomes that are relatively pathovar specific, but which are not shared with commensal E. coli strains. A potential pitfall of this analysis is that there are at present a very limited number of true commensal isolates from healthy humans for which DNA sequence data are available (Table 2). Nevertheless, the many E. coli genes that provide for essential metabolic functions and core structural elements of these organisms can be digitally subtracted from pathogen genomes. Ideally, candidates would be shared broadly among a diverse population of ETEC. To this end, genome data from several hundred ETEC strains are presently available [11, 31, 35], permitting pan-ETEC genome comparative analyses to identify features that are relatively conserved in this pathovar.

    Table 2 Commensal E. coli strains with sequenced genomes

  3. Identification of putative vaccine antigens from genome-subtracted data.

    Following the identification of conserved, pathovar-specific features, the next major task is to identify those antigens that are potentially surface expressed, and/or which share features in common with known vaccine antigens using the algorithms outlined in Table 1 and Fig. 3. These approaches are complementary. The inclusion of molecules with motifs or domains conserved in other vaccine antigens permits the investigator to capture putative antigens where surface expression may not be obvious, or alternatively to prioritize antigens that share features with effective vaccine targets.
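The “digital subtraction” and conservation filtering described above can be sketched with simple set operations once each genome has been reduced to a set of gene-family identifiers (for example, by clustering predicted proteins). The Python below is a minimal illustration under that assumption; the strain labels, family identifiers, and the conservation threshold are hypothetical and are not drawn from any published dataset or tool.

```python
from collections import Counter


def pathovar_specific_conserved(
    etec_genomes: dict[str, set[str]],
    commensal_genomes: dict[str, set[str]],
    min_fraction: float = 0.5,
) -> list[str]:
    """Return gene families absent from every commensal genome and
    present in at least min_fraction of the ETEC genomes."""
    if not etec_genomes:
        return []
    # Union of everything seen in any commensal: the set to subtract.
    commensal_genes = set().union(*commensal_genomes.values())
    counts = Counter()
    for genes in etec_genomes.values():
        counts.update(genes - commensal_genes)
    n_etec = len(etec_genomes)
    return sorted(fam for fam, n in counts.items()
                  if n / n_etec >= min_fraction)


# Toy presence/absence data (hypothetical identifiers).
etec = {
    "ETEC_1": {"core_1", "toxin_1", "adhesin_1"},
    "ETEC_2": {"core_1", "toxin_1"},
    "ETEC_3": {"core_1", "toxin_1", "adhesin_1", "unique_3"},
}
commensals = {"COMM_1": {"core_1", "commensal_only_1"}}

candidates = pathovar_specific_conserved(etec, commensals, min_fraction=0.6)
print(candidates)
```

The surviving families would then be passed to the localization and domain-prediction tools referenced in Table 1; that second step is not modeled here.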

3.6 Preclinical Antigen Validation In Vivo

3.6.1 General Approach to In Vivo Studies

One of the problems facing investigators hoping to develop effective enteric vaccines is the lack of a small animal model that faithfully recapitulates the nature of the illness in humans. Mice do not develop diarrhea with any of the common enteric pathogens that infect humans, including enterotoxigenic E. coli, even at high doses that typically cause serious illness in volunteers. Nevertheless, mice do become colonized with ETEC following oral (gavage) challenge with inocula as small as 10³–10⁴ colony-forming units, permitting a straightforward assessment of the impact of vaccination with candidate antigens on intestinal colonization, a critical step in pathogenesis [36]. Studies to date in this model have revealed that colonization of the small intestine is a complex phenotype involving a variety of virulence factors in addition to the fimbrial structures that have been the traditional targets for ETEC vaccines [37–40], thereby affording additional approaches to vaccine development that may overcome the limitations of CF-based vaccines. The basic protocol for mouse vaccination studies follows:

  1. (~Day −7) Acquire and acclimate mice (at least 1 week prior to experimentation).

  2. (Day 0) Vaccinate mice (n ≥ 10) with adjuvant and antigen (dose will depend on adjuvant and route of administration). Vaccinate an equal number with adjuvant only as a control group.

  3. (Day 14) Administer first booster vaccination.

  4. (Day 28) Administer second booster vaccination.

  5. (Day 40) Add streptomycin (5 g/L) to drinking water.

  6. (Day 41) Remove streptomycin, and return to regular drinking water. Grow challenge strain overnight.

  7. (Day 42) Challenge with ~10⁴–10⁵ colony-forming units of ETEC by gavage. Plate dilutions of inoculum onto selective media.

  8. (Day 43) Sacrifice mice, harvest segments of small intestine in saponin, and plate undiluted, 10⁻¹, and 10⁻² dilutions onto selective media.

  9. Determine cfu/mouse in vaccinated and control groups.
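The final step, back-calculating cfu/mouse from plate counts, is simple arithmetic that is easy to get wrong by a factor of ten. The Python sketch below shows one way to do it. The plated volume (0.1 ml), the total lysate volume (5 ml), and all colony counts are hypothetical values for illustration; the protocol above does not specify them, and no statistical test is implied.

```python
from statistics import median


def cfu_per_sample(colonies: int, dilution: float,
                   plated_ml: float, total_ml: float) -> float:
    """Back-calculate total cfu in a sample from one countable plate.

    dilution is the dilution of the plated suspension relative to the
    original sample (1.0 = undiluted, 0.1 = 10^-1, 0.01 = 10^-2).
    """
    if dilution <= 0 or plated_ml <= 0:
        raise ValueError("dilution and plated volume must be positive")
    cfu_per_ml = colonies / (dilution * plated_ml)
    return cfu_per_ml * total_ml


PLATED_ML = 0.1  # assumed volume spread per plate
TOTAL_ML = 5.0   # assumed volume of saponin lysate per mouse

# Hypothetical (colonies, dilution) pairs, one countable plate per mouse.
control_plates = [(42, 0.01), (130, 0.01), (67, 0.1)]
vaccinated_plates = [(12, 1.0), (55, 0.1), (8, 1.0)]

control = [cfu_per_sample(c, d, PLATED_ML, TOTAL_ML)
           for c, d in control_plates]
vaccinated = [cfu_per_sample(c, d, PLATED_ML, TOTAL_ML)
              for c, d in vaccinated_plates]

print("median cfu/mouse, control:", median(control))
print("median cfu/mouse, vaccinated:", median(vaccinated))
```

Medians are used here only as a compact summary; the choice of statistical comparison between groups is left to the investigator.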

3.7 Preclinical Antigen Studies In Vitro

3.7.1 Bacterial Adhesion Assays

Effective interaction with intestinal epithelial cells is a key event in ETEC pathogenesis. Bacterial adhesion assays in which bacteria are added to intestinal epithelial cells cultured in vitro have become a mainstay of ETEC molecular pathogenesis investigations [29, 37, 41–44]. Despite the simplicity of these assays, in which bacteria remaining attached to epithelial cells are quantitated after a finite period of incubation, they have been instrumental in characterizing the role of a number of essential virulence factors, including several different adhesins. Likewise, they have been used in a number of preclinical vaccinology studies, where antibodies raised against specific surface antigens are tested for their ability to mitigate bacterial-host interactions [39, 45, 46]. The basic ETEC adhesion assay follows:

  1. Plate target epithelial cell line (e.g., Caco-2) into 96-well tissue culture-treated plates.

  2. Incubate at 37 °C, 5 % CO₂ to establish confluent monolayers.

  3. Inoculate 2 ml of LB media in a 15 ml round-bottom tube with frozen glycerol stock of the ETEC testing strain (e.g., H10407).

  4. Incubate overnight at 37 °C, 200 rpm.

  5. Dilute 1:100 into 2 ml of fresh LB; grow for ~90 min to mid-logarithmic phase.

  6. Immediately prior to addition of bacteria, add antibody against target antigen(s).

  7. Inoculate tissue culture wells with 1–2 μl of bacteria per well.

  8. Return plate to tissue culture incubator for 1 h.

  9. During incubation, plate dilutions of inoculum onto Luria agar.

  10. After 1 h, remove plate from tissue culture incubator and wash 4–5× with HBSS, 100 μl/well.

  11. Lyse epithelial cells with 0.1 % Triton X-100 for 5 min.

  12. Plate dilutions of lysate in PBS onto Luria agar.

  13. The following day count inoculum and output colonies.

  14. Express results as % cell-associated bacteria (recovered cfu/input cfu × 100).
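The calculation in the last step can be written out as a short Python sketch. The formula (recovered cfu / input cfu × 100) is taken from the protocol; the cfu values are hypothetical, and the function name is illustrative.

```python
def percent_cell_associated(recovered_cfu: float, input_cfu: float) -> float:
    """Percent of the inoculum recovered with the washed, lysed monolayer:
    recovered cfu / input cfu * 100."""
    if input_cfu <= 0:
        raise ValueError("input cfu must be positive")
    if recovered_cfu < 0:
        raise ValueError("recovered cfu cannot be negative")
    return recovered_cfu / input_cfu * 100.0


# Hypothetical wells: the same inoculum with and without antibody.
input_cfu = 1.0e6
no_antibody = percent_cell_associated(2.5e4, input_cfu)
with_antibody = percent_cell_associated(5.0e3, input_cfu)

print(f"no antibody: {no_antibody:.2f} %; with antibody: {with_antibody:.2f} %")
```

Both cfu values must be back-calculated to the same basis (per well) from their respective plate counts and dilutions before the ratio is taken.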

3.7.2 Toxin Delivery Assays

ETEC delivery of heat-labile and/or heat-stable enterotoxins, which, respectively, activate cAMP and cGMP production in target intestinal epithelial cells, is the sine qua non virulence feature that defines this pathovar. Therefore, much can be learned from detailed investigation of the molecular events that culminate in bacterial activation of these cyclic nucleotides. In vitro studies using intestinal epithelial cell lines (Table 3) can be used to investigate the efficiency with which mutant strains lacking candidate virulence features deliver LT and/or ST enterotoxin payloads. Consequently, cAMP and cGMP assays also provide a convenient surrogate marker to gauge the effectiveness of antibodies to individual candidate antigens or to a combination of targets [46] in abrogating toxin delivery. A basic protocol for assessing delivery of the respective toxins follows:

  1. Plate target epithelial cell line (e.g., Caco-2) into 96-well tissue culture-treated plates.

  2. Incubate at 37 °C, 5 % CO₂, to establish confluent monolayers.

  3. Inoculate 2 ml of LB media in a 15 ml round-bottom tube with frozen glycerol stock of the ETEC testing strain (e.g., H10407).

  4. Incubate overnight at 37 °C, 200 rpm.

  5. Dilute 1:100 into 2 ml of fresh LB; grow for ~90 min to mid-logarithmic phase.

  6. Immediately prior to addition of bacteria, add antibody against target antigen(s).

  7. Inoculate tissue culture wells with 1–2 μl of bacteria per well.

  8. Return plate to tissue culture incubator for ~3 h.

  9. Wash three times with pre-warmed tissue culture media.

  10. Return to incubator for an additional 2 h.

  11. Wash gently with HBSS.

  12. Process cells for cyclic nucleotide quantitation following the instructions provided with the assay kit.
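When cyclic nucleotide readouts are used to gauge how well an antibody blocks toxin delivery, one common way to summarize the data is as percent inhibition relative to an infected, antibody-free control after subtracting the uninfected baseline. The Python sketch below shows that calculation. This normalization is an assumption made for illustration, not a step prescribed by the protocol or the assay kits, and the concentrations are hypothetical.

```python
def percent_inhibition(treated: float, infected_control: float,
                       uninfected_baseline: float) -> float:
    """Percent reduction in the toxin-induced cyclic nucleotide signal.

    treated:             wells with bacteria plus antibody
    infected_control:    wells with bacteria, no antibody
    uninfected_baseline: wells with no bacteria
    All three values must be in the same units (e.g., pmol/ml).
    """
    induced = infected_control - uninfected_baseline
    if induced <= 0:
        raise ValueError("infected control must exceed the baseline")
    remaining = treated - uninfected_baseline
    return (1.0 - remaining / induced) * 100.0


# Hypothetical cAMP readings in pmol/ml.
inhibition = percent_inhibition(treated=7.0, infected_control=12.0,
                                uninfected_baseline=2.0)
print(f"inhibition of cAMP activation: {inhibition:.1f} %")
```

A value near 0 % indicates no effect of the antibody, and a value near 100 % indicates a signal reduced to the uninfected baseline.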

Table 3 Gastrointestinal cell lines used to examine ETEC pathogen-host interactions

4 Notes

  1. TnphoA.ts mutagenesis notes

    (a) While resident plasmids of ETEC will co-purify with chromosomal DNA in commercial genomic DNA isolation kits, this is not true of commercial plasmid preparation kits. ETEC plasmids are often quite large and do not purify easily with most commercial plasmid kits.

    (b) Re-ligation will generally favor intramolecular ligation; therefore the resulting plasmid should have a single unique restriction site joining the flanking regions. This can be confirmed by restriction endonuclease digestion.

  2. Sample strategy: Identification of EtpA as a vaccine antigen.

    The investigation of EtpA as a candidate vaccine antigen to date has involved many of the strategies outlined above. Therefore, in the following section, we use EtpA to illustrate the application of the different bioinformatic algorithms to vaccine candidate selection.

    (a) Identification of conserved, pathovar-specific antigens

      • Comparison of the sequence of the large plasmid of ETEC H10407 (p948) (http://www.ncbi.nlm.nih.gov/nuccore/FN649418.1) with the sequenced genomes of HS, SE11, SE15, and E. coli K-12 (MG1655) (e.g., using the singleton method in EDGAR) yields a list of only 33 candidate genes that are not found in any of the commensals or E. coli K-12. Included in this list are genes involved in synthesis of the colonization factor antigen CFA/I, the EtpBAC operon, and the EatA serine protease autotransporter.

      • BLAST-P (http://blast.ncbi.nlm.nih.gov) analysis of the EtpA peptide sequence (GenBank accession number AAX13509) identifies many close homologues of EtpA. However, these are strictly confined to ETEC strains that have been sequenced to date. In fact, recent studies suggest that this antigen is relatively conserved in the ETEC pathovar [31]. Conversely, BLAST-P against commensals HS, SE11, SE15, Nissle 1917, or the laboratory isolate MG1655 fails to reveal any significant homology, further suggesting that this particular protein is “pathovar specific.”

    (b) Examination of potential surface expression: Analysis of the EtpA peptide sequence with each of the cell localization and domain characterization algorithms outlined in Table 1 yields the results provided in Table 4. Without prior knowledge of EtpA function, these algorithms would have (correctly) predicted that this protein shares a number of features with filamentous hemagglutinin (FHA), a component of the acellular pertussis vaccine, and that, similar to FHA, it belongs to the two-partner family of secretion molecules, which feature atypical extended signal peptides. Like FHA, EtpA is correctly predicted to function as an extracellular adhesin molecule [47]. Collectively, these results suggest that when applied to ETEC, reverse vaccinology approaches have the potential to select novel candidate antigens for downstream validation in vitro and ultimately for vaccine testing in vivo.

      Table 4 EtpA as a prototype molecule in reverse vaccinology algorithms

  3. Refining antigen selection.

    (a) Using immunoproteomics to narrow antigen selection: While the bioinformatics approaches above can offer a list of candidates, the list may be extensive, and additional criteria will likely be needed to refine selection of molecules for further testing as vaccine candidates. Therefore a number of additional modalities have recently been used to highlight key antigens that can be exploited in a vaccine. Previous efforts have combined an examination of the proteome of ETEC with immune responses generated during experimental infection of animals or natural infections in humans to identify novel antigens [48]. A number of antigens that are not currently targeted in ETEC vaccine approaches were identified, including the secreted proteins EtpA, EatA, and YghJ, and antigen 43, an autotransporter protein. To date, three of these proteins (EtpA, EatA, and antigen 43) have been shown to offer protection against ETEC infection in an animal model [40, 49, 50]. Nevertheless, this approach has a number of very important limitations: (1) it requires fractionation of bacterial samples to separate proteins which are secreted or which localize to the outer membrane; (2) only one strain can be examined at a time; (3) laboratory culture conditions may not reflect those in vivo, impeding identification of proteins which are not optimally expressed or are present in low abundance; (4) it is laborious, requiring 2D separation of proteins, subsequent identification of immunoreactive spots by immunoblotting, and extraction of the corresponding protein from a parallel sample, which is then identified using mass spectrometry.

    (b) Protein microarrays: It is now possible to overcome many of the limitations inherent in the approach outlined above through the use of protein microarrays [51]. These arrays, which have been applied in the investigation of immune responses to diverse pathogens [52–55], offer a number of theoretical advantages. (1) They can incorporate features from a number of isolates, thereby coupling bioinformatic analysis of many strains to the printing of key conserved antigens onto the array. (2) It is now possible to synthesize sufficient protein by in vitro transcription-translation to accomplish the high-throughput antigen synthesis needed to construct hundreds of arrays. (3) The relatively small format of these arrays greatly reduces the sample volumes required for analysis of thousands of candidate antigens simultaneously. Several unpublished projects using ETEC-specific protein microarrays show significant promise in profiling protective immune responses to candidate ETEC vaccines, and in assessing responses that follow natural infections. While experimental and natural ETEC infections offer protection against subsequent disease, the mechanistic correlates of protection have not been established. Protein microarrays potentially afford an unbiased approach to finding immunologic signatures associated with protection that can then be mined to prioritize antigens for subsequent vaccine testing.

  4. Summary.

    Since the discovery of ETEC more than 40 years ago in individuals with severe cholera-like diarrheal illness [56, 57], vaccine development efforts have largely focused on a small group of plasmid-encoded antigens, namely the colonization factors (CFs). With time, investigators have gained an increased appreciation for the complex valency requirements for vaccines based exclusively on CFs [9, 23] and/or heat-labile toxin [58], stimulating the discovery of additional antigens that could complement existing approaches. A compilation of more than 100 sequenced ETEC genomes provides a very rich dataset to interrogate in pursuit of additional vaccine targets.