Background

Pneumococci (Streptococcus pneumoniae) are a major bacterial cause of disease for which antimicrobial resistance is of increasing concern. A recent report estimated that in 2009 >24% of disease-causing pneumococci in the USA were tetracycline-nonsusceptible and 8.2% were chloramphenicol-nonsusceptible [1]. In some European countries, such as Spain and France, these rates were higher: 35.1% and 42.3% of pneumococci isolated in these countries, respectively, were tetracycline-nonsusceptible, while 26.1% and 20.3%, respectively, were chloramphenicol-nonsusceptible [2]. Similarly, 39.2% of pneumococci in the USA [1], 26.7% of those in Spain and 30.0% of those in France were macrolide-nonsusceptible [2]. In some regions of Asia rates were even higher: 84.8%, 66.8% and 96.4% of pneumococcal isolates were tetracycline-, chloramphenicol- and erythromycin-nonsusceptible, respectively [3].

Among pneumococci, chloramphenicol and tetracycline nonsusceptibility are conferred through acquisition of the cat and tet genes (most commonly tet(M)). The cat gene encodes a chloramphenicol acetyltransferase, which catalyses the conversion of chloramphenicol into non-active derivatives. tet(M) encodes a ribosomal protection protein shown to initiate reversal of tetracycline-ribosome binding, an event which would otherwise inhibit protein synthesis [4].

Pneumococcal nonsusceptibility to erythromycin and other macrolides is most commonly conferred by acquisition of the erm(B) and/or mef(A/E) genes [5, 6]. erm(B) encodes an N-methyltransferase which mediates macrolide-target site alteration [7]. mef(A/E) encodes macrolide efflux pumps [8] allowing the resistant cell to reduce its internal macrolide concentration and thereby eliminate the detrimental effects of these drugs.

In recent years it has become apparent that the majority of tet(M) and/or cat genes among tetracycline- and/or chloramphenicol-nonsusceptible pneumococci are contained within genetic elements known as integrative conjugative elements (ICEs) [9]. tet(M) is associated with the Tn916 family of ICEs, the first of which was described in 1981 from an Enterococcus faecalis isolate [10]. cat is associated with the Tn5252 ICE family, first described in 1991 [11], where, besides the typical intTn5252 integrase, a different integrase (intICESp 23FST81) has recently been described [12].

To date, Tn916-like ICEs have been identified among >35 bacterial genera [13]. Tn916 itself is ~18 kb in length and contains 24 genes arranged into functional modules associated with conjugal transfer, recombination, transcriptional regulation and accessory functions (e.g. tetracycline resistance [14]). The recombination module contains the integrase (int) and excisionase (xis) genes. The int gene encodes a tyrosine recombinase responsible for integration and excision of the element from the host chromosome. The product of xis drives the directionality of the recombination process, most commonly by promoting excision [14]. Tn916 has been shown to have the ability to transpose intracellularly and to mediate its own intercellular transfer via conjugation [14]. Successful conjugal transfer between pneumococci and into pneumococci from other species is a strain-dependent property [15, 16]; transfer between pneumococci by transformation has also been documented [17]. Aside from Tn916, other members of this family may contain alternate accessory genes and/or feature a smaller, independent genetic element integrated within the Tn916 genetic background, e.g. the erm(B)-containing Tn917 or Omega elements, or the mef(E)-containing MEGA (macrolide efflux genetic assembly) element [17–19].

Among pneumococci, Tn5252-like ICEs are most commonly identified in association with Tn916-like ICEs [9, 11, 12, 18] whereby the resulting composite element is known as Tn5253 (or Tn5253-like) and may be highly variable [9, 18]. In addition to the cat gene, Tn5252 (~48 kb in length) contains an independent int gene and the umuDC gene, shown to provide protection from UV damage [20]. In some cases Tn5252-like ICEs also contain a lantibiotic synthesis gene cluster [12, 18]. Lantibiotics, or lanthionine-containing antibiotics, are antimicrobial compounds with activity against a range of gram-positive bacterial species [21].

There is evidence to suggest that Tn916-like and Tn5252-like elements were present among pneumococci in the 1970s [22, 23] and to our knowledge the oldest pneumococcus for which such an element has been described was isolated in 1974 [11, 22]. However, tetracycline-nonsusceptible pneumococci were first identified in 1962 [24], while pneumococcal resistance to erythromycin and chloramphenicol was recognised in 1967 [25] and 1970 [26], respectively. Given the important role of these ICEs in the dissemination of resistance determinants, it therefore seems likely that they, or similar genetic elements, were present among pneumococcal populations from at least as early as the 1960s. The aim of this study was to use our historical genome collection [27] to search for, characterise, and compare such older elements among pneumococci isolated prior to 1974, thus pre-dating the earliest known pneumococcal Tn916 and Tn5252 representatives [10, 11, 22].

Results

Nucleotide comparison of pneumococcal ICE int genes

Ten uniquely designated pneumococcal ICEs were identified from Genbank. Four elements (Tn1131, Tn5253, ICESpn 11876 and ICESpn 11930) each contained two independent int genes. Six elements each contained only a single int gene. Alignment of all 14 int nucleotide sequences showed four clearly defined clusters; sequences assigned to the same cluster shared >98% nucleotide identity (Table 1).

Table 1 Integrase genes identified in specific pneumococcal genetic elements, grouped by nucleotide sequence similarity

Identification of resistance determinants and transposon int genes among pneumococci isolated prior to 1974

In total, 19 unique CCs and 23 unique serotypes/groups were represented by the 38 isolates included in this study (Table 2). The BIGS database BLASTn tool [28] was used to screen the genomes of these isolates for the presence of each of the tet(M), cat, erm(B) and mef(A/E) resistance determinants, and one representative of each of the int nucleotide sequence clusters. Two isolates, 14/5 (1967) and 18C/3 (1968), were positive for both intTn916 and tet(M). Isolate PN1 (1972) did not possess tet(M), cat, erm(B) or mef(A/E) resistance determinant genes, although it had intTn5252. No other isolates were positive for any of the tet(M), cat, erm(B) or mef(A/E) resistance determinant genes, or any int genes.

Table 2 Pneumococcal isolates included in this study

Within the 14/5 genome, the intTn916 and tet(M) genes were located on a Tn916 element which was structurally identical (i.e. contained the same putative genes in the same order and was a complete BLASTn match) to the Tn916 region of ICESp 23FST81 (the PMEN1 pneumococcal reference isolate ICE, Figure 1). In the 14/5 genome this Tn916 ICE was inserted upstream of the pspA locus, which encodes pneumococcal surface protein A.

Figure 1

Comparison of mobile genetic elements identified in this study and those previously described. Elements are named as in text, and isolates in which these elements were identified are named in parentheses. Predicted genes are depicted by horizontal arrows; cyan arrows represent tet(M) resistance genes, the orange arrow indicates cat, and purple arrows represent lantibiotic-associated genes. Red bars represent BLASTn matches between sequences. Blue bars represent reverse-oriented BLASTn matches. The region of missing sequence within the Tn916-(bacterio)phage composite element is indicated by a vertical black arrow; the asterisk marks an ~11.3 kb insertion in ICESp PN1 (as described in Results). See Additional file 1 for details about the genes found within the Tn916-phage composite and ICESp PN1 elements.

Within the 18C/3 genome, intTn916 and tet(M) were located on a Tn916 element, itself associated with a putative bacteriophage showing similarity to the Streptococcus phage 040922 (Genbank accession no. FR671406.1; Figure 1) and inserted in the genome downstream of the trxB locus, which encodes a pyridine nucleotide-disulphide oxidoreductase. The genome of the 18C/3 bacteriophage contained 48 genes, 31 of which were predicted to encode hypothetical proteins. A further 13 genes were predicted to encode functional phage proteins such as the phage integrase, lytic amidase, holin and structural phage proteins (Additional file 1; Genbank accession no. KC488256). It should be noted here that contiguous sequence could not be obtained across the full length of the 18C/3 Tn916-bacteriophage composite element, despite repeated attempts using PCR and conventional Sanger sequencing techniques. The position of the missing region is shown in Figure 1 and likely included part of the pblB gene, which may play a role in adhesion [29]. Relative to those of 14/5 and ICESp 23FST81, the 18C/3 Tn916-like sequence contained a 155 bp deletion between orf12 and tet(M).

Comparative analyses indicated that PN1 harboured a novel, composite ICE which was inserted in the genome upstream of the rbgA locus, which encodes a ribosomal biogenesis GTPase. This novel ICE, designated ICESp PN1, contained 45 genes and regions of similarity to Tn5252 and PPI-1 (Pneumococcal Pathogenicity Island 1) as described below (Figure 1 and Additional file 1; Genbank accession no. KC488257).

The 15 5’-most genes of ICESp PN1 showed similarity to those of the Tn5252-like region of ICESp 23FST81 and included repA, which encodes replication initiation factor A. A further four and two putative genes also showed similarity to those of Tn5252 and were separated by a ~11.3 kb insertion (Figure 1). This insertion contained 10 putative genes, including four with ≥96% nucleotide sequence identity to those of a two-component signalling system described for Streptococcus mitis strain B6 [30] and S. pneumoniae ICESpn 8140 [18]. These genes were predicted to encode an ABC-type antimicrobial peptide transporter, an ABC-type transporter permease, a sensor histidine kinase and a response regulator. Directly 3’ of the two-component system cluster were two genes for which the predicted peptide products showed 64% amino acid identity to the LanM lantibiotic and 76% amino acid identity to a lantibiotic transporter, respectively.

The 3’-most ICESp PN1 gene represented intTn5252 and was preceded by a putative excisionase gene, predicted to encode a protein with 83% amino acid identity to a putative Streptococcus sp. excisionase. Directly upstream of the putative excisionase gene was a region which was structurally identical to part of PPI-1 of TIGR4, another pneumococcal reference isolate (Genbank accession no. NC_003028.3, see Figure 1). This region spanned nine predicted genes, the predicted protein products of which included the PlcR putative transcriptional regulator, an IS200 family transposase, a putative protein kinase, and both a putative ABC-transporter ATP-binding protein and permease.

Comparison of Tn916 sequence regions

The 14/5 Tn916-like nucleotide sequence differed from that of ICESp 23FST81 by only 101 substitutions and a 5 bp insertion within orf14. Ninety-four of the nucleotide differences between these sequences were located within the tet(M) gene, which is consistent with the previous observation that tet(M) possesses a mosaic nucleotide sequence [31]. Additionally, this Tn916-like sequence shared >99% nucleotide identity with Tn916-like ICEs identified among contemporary bacterial isolates including Streptococcus suis, Streptococcus parauberis and Staphylococcus aureus [32–34]. The 18C/3 Tn916-like nucleotide sequence was highly similar to that of 14/5, differing by only two nucleotide substitutions (and the deletion described above).

Discussion

This study of an historical collection of pneumococci has led to the discovery of two of the earliest known representatives of pneumococcal Tn916-like ICEs and a novel composite ICE. The Tn916-like ICEs from pneumococci isolated in 1967 and 1968, respectively, were highly similar to the Tn916-like region of ICESp 23FST81 from the PMEN1 reference strain dated 1984 [12], and contemporary Tn916-like ICEs. Both of these ICEs contained the tetracycline-resistance determinant tet(M), but did not contain any other resistance determinant genes. Identification of such elements among pneumococci isolated in the late 1960s confirms their existence from the first decade within which tetracycline resistance was reported among pneumococci [24]. (Note that tetracycline was released for use in 1948, chloramphenicol in 1947 and erythromycin in 1952 [35].)

The Tn916-like element dated 1968 in isolate 18C/3 was inserted within a phage showing similarity to the Streptococcus phage 040922 (Genbank accession no. FR671406). Bacteriophages are known to mediate horizontal gene transfer between bacteria through the process of transduction [36], although it is not clear to what extent this process occurs amongst pneumococci, or whether the 040922 phage has retained this ability. Pneumococcal Tn916-like ICEs have not generally been associated with phage [9, 16, 17, 19] and no other such phage-associated representatives are present in the Genbank database. Thus it seems unlikely that phage-mediated transduction has played an important role in the dissemination of Tn916-like ICEs, although it is possible that the 18C/3 ICE was acquired in this way.

Aside from the tet(M) genes described above, no other tet(M), cat, erm(B) or mef(A/E) resistance determinants were identified among the pneumococcal genomes studied here. There are other resistance mechanisms such as erm(A), erm(TR) or tet(O) that have been found in pneumococci but are believed to be rare [37, 38]. We searched for these genes among our genome collection and found no evidence for them, thus in this study we focussed on the common tetracycline, chloramphenicol and erythromycin resistance mechanisms among pneumococci.

In addition to those described above, an ICE-associated int gene was identified among one further isolate, PN1, dated 1972. It is also worth noting that this strain was among the earliest penicillin-nonsusceptible pneumococci to be identified, by virtue of the possession of altered penicillin-binding protein genes [27]. Further analyses indicated that this pneumococcus harboured a novel composite ICE, designated ICESp PN1, which contained regions of similarity to the previously described Tn5252-like ICEs and PPI-1 of the TIGR4 reference isolate. Additionally this ICE contained a two-component signalling system gene cluster and two putative lantibiotic biosynthesis-/export-associated genes.

Lantibiotics are small, lanthionine-containing antimicrobial peptides. Synthesis and secretion follow detection of extracellular signals and are auto-regulated by two-component signal systems such as that putatively described above for ICESp PN1. Antimicrobial activity is achieved following post-translational modification. Production of immunity proteins is required to protect the host cell [21]; however, no such post-translational modification enzyme- or immunity protein-associated genes were identified within ICESp PN1, calling into question the ability of PN1 to produce a functional version of this lantibiotic.

PPI-1 contains a 5’ region which is highly conserved among pneumococci and a 3’ region which is not conserved (with the exception of the most 3’ gene) [12, 39]. ICESp PN1 contained a region which was structurally identical to the non-conserved region of TIGR4 PPI-1. Previously it was noted that loci within the conserved regions of PPI-1 showed similarity to those of Tn5252 [39]. Subsequently it was suggested that sequences may be exchanged between these elements via homologous recombination [12]. Such a process could explain the patterns of shared sequence structure described here.

Previous authors have demonstrated the capacity for diversity among pneumococcal ICEs [9, 18], and the description of ICESp PN1 reiterates this dynamic variability. The finding that two pneumococci isolated in the late 1960s harboured Tn916-like ICEs, and that these ICEs were highly similar to that of the epidemiologically-successful PMEN1 reference strain and contemporary Tn916-like elements, demonstrates the ability of these ICEs to persist, almost unchanged (≥99% nucleotide sequence identity), within bacterial populations for many years. The phenotypic effects associated with possession of such ICEs have an important impact on our ability to treat pneumococcal disease. In this context, understanding the processes driving the spread, maintenance and/or diversification of these elements is of the utmost importance.

Conclusions

In this study we discovered the oldest known examples of tetracycline resistance-conferring pneumococcal genetic elements: two different Tn916-like, tet(M)-containing, elements identified among pneumococci dated 1967 and 1968. The former element was highly similar to that of the PMEN1 multidrug-resistant, globally-distributed pneumococcal reference strain isolated in 1984. The latter element was uniquely associated with a streptococcal phage. We also described a novel ICE element in a pneumococcal isolate recovered in Papua New Guinea in 1972, and interestingly, this isolate was also one of the earliest penicillin-nonsusceptible pneumococci. This novel element, designated ICESp PN1, contained a region of similarity to Tn5252, a region of similarity to a pneumococcal pathogenicity island and novel lantibiotic synthesis/export-associated genes. The importance of antimicrobial resistance among pneumococci is unequivocal, and our work sheds further light on how these particular resistance determinants have evolved.

Methods

Genomic sequencing, serotype and genotype data

Whole-genome sequence data for pneumococci sampled from an historical isolate collection were previously generated and described [27]. Thirty-six of these genome sequences represented pneumococci isolated prior to 1974 and were thus included in this study (Table 2). The genomes of two additional pneumococci, PN2 and PN1, isolated in 1969 and 1972 respectively (Table 2), had recently been added to our historical isolate collection and were included because they were also isolated before 1974. These genome sequences were generated on the Illumina HiSeq platform; production of 200 bp insert libraries was followed by 100 nucleotide paired-end sequencing using standard protocols. Illumina reads were assembled to consensus contigs using Velvet [40]. Data were deposited in a BIGS database [28]. Serotype/group and genotype data (as defined by multilocus sequence typing [41]) for all isolates were previously described. Isolates were assigned to clonal complexes (CCs – clusters of genotypes descended from a recent common ancestor) by a modified goeBURST method [27, 42].

Identification of putative ICEs

Reference nucleotide sequences for each of the tet(M), cat, erm(B) and mef(A/E) resistance determinants were retrieved from Genbank. int gene sequences were retrieved from each uniquely designated pneumococcal ICE, identified by interrogation of Genbank using the following search term combinations: Organism = Streptococcus pneumoniae + Title = Transposon, Organism = Streptococcus pneumoniae + Title = Conjug*, Organism = Streptococcus pneumoniae + Title = element. int nucleotide sequences were aligned by MUSCLE [43] and imported to MEGA5 [44] for assignment to clusters (by visual comparison), and sequence identity calculation. The BIGS database BLASTn tool [28] was used to search isolate genomes for the tet(M), cat, erm(B) and mef(A/E) resistance determinants, and a single representative of each int gene cluster.
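The cluster assignment described above can be illustrated with a minimal Python sketch that computes pairwise nucleotide identity over pre-aligned sequences and groups them by single linkage. In this study clusters were assigned by visual comparison in MEGA5, so the sketch is not the method used here; the function names, the exclusion of gap columns, the 98% threshold parameter (taken from the identity reported in Results) and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are excluded from the calculation
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def cluster_by_identity(aligned, threshold=98.0):
    """Single-linkage clustering: join sequences sharing > threshold % identity."""
    parent = {name: name for name in aligned}

    def find(name):
        # Follow parent links to the cluster representative (union-find).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for p, q in combinations(aligned, 2):
        if percent_identity(aligned[p], aligned[q]) > threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for name in aligned:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=sorted)


# Toy, made-up aligned sequences (hypothetical names; not real int genes).
base = "ACGT" * 25
other = "TTGCA" * 20
toy = {
    "int_a1": base,
    "int_a2": "T" + base[1:],      # one substitution relative to int_a1
    "int_b1": other,
    "int_b2": other[:-1] + "C",    # one substitution relative to int_b1
}
print(cluster_by_identity(toy))
```

Single linkage is used here only because it is simple to state; with real data a different linkage rule or threshold could give different groupings.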

Isolates positive for any resistance determinant or int gene were further studied by extraction of the relevant Velvet contig from the BIGS database and comparison to the genomes of the pneumococcal reference strains R6 and PMEN1 (Genbank accession no. AE007317.1 and NC_011900.1, respectively) using the Artemis Comparison Tool (ACT) [45]. Candidate genetic elements were identified as those which contained one or more of the resistance determinants of interest (i.e. tet(M), cat, erm(B) and mef(A/E)) and/or int genes, and were not similar to any region of the R6 genome, which does not contain any Tn916/Tn5252-like ICEs. Candidate Tn916-like and/or Tn5252-like elements were further identified as those which showed similarity to the Tn916 and/or Tn5252-like regions of the PMEN1 reference ICE, ICESp 23FST81 [12]. Regions of the resistant determinant/int-containing contigs that did not show similarity to any part of the R6 genome or PMEN1 ICE were extracted and queried against the Genbank database by BLASTn. Sequences representing the best BLAST matches were retrieved from the database for further comparison using ACT.
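Extracting the contig regions that lack similarity to the R6 genome or the PMEN1 ICE can be pictured as an interval problem. In this study the comparison was made with ACT; the sketch below is only one assumed way of expressing the idea. It takes hypothetical 1-based inclusive match coordinates on a contig (already oriented so that start ≤ end) and a hypothetical minimum length for reported regions.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def unmatched_regions(contig_length, matches, min_length=1):
    """Return contig regions not covered by any match, as (start, end) tuples.

    matches: (start, end) coordinates of similarity matches on the contig,
    e.g. matches to a reference genome (hypothetical input format).
    """
    regions = []
    position = 1  # first base not yet known to be covered
    for start, end in merge_intervals(matches):
        if start - position >= min_length:
            regions.append((position, start - 1))
        position = max(position, end + 1)
    if contig_length - position + 1 >= min_length:
        regions.append((position, contig_length))
    return regions


# Toy example: a 1000 bp contig whose two ends match a reference.
print(unmatched_regions(1000, [(1, 200), (150, 300), (801, 1000)]))
```

Regions returned by such a function would correspond to the sequences that were subsequently queried against Genbank.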

Identification of additional ICE regions and genome integration sites

When a single Velvet contig contained putative ICE sequences plus regions showing similarity to the R6 reference genome, the latter were considered to represent the genomic flanking regions of the ICE. Consequently putative ICE genomic integration sites were identified by reference to the R6 genome annotation.

When genomic flanking sequences were not contiguous to the putative ICE sequences, SMALT [46] was used to map Illumina sequence reads to the Velvet consensus assembly. Mapped assemblies were converted to a gap5 database [47]. Illumina reads that mapped to the ends of the putative ICE-containing contigs were checked for the location of their corresponding paired reads. This enabled the identification of consensus assembly contigs representing the adjacent region(s) of the genome. Comparison of these contigs to the R6 genome using ACT identified the putative ICE integration sites. Sequences which were not similar to the R6 genome were considered additional ICE regions and were queried against the Genbank database by BLAST.
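The paired-read check described above amounts to asking, for reads that map near the end of an ICE-containing contig, which other contigs their mates map to. In this study that was done by inspecting a gap5 database; the sketch below instead assumes a simplified, hypothetical record format (pair name, contig, position) and does not reproduce any real SMALT or gap5 output. The end-window size and the minimum number of supporting pairs are illustrative values.

```python
from collections import Counter, defaultdict


def linked_contigs(mappings, contig_lengths, target, window=500, min_pairs=2):
    """Count contigs holding mates of reads mapped near either end of `target`.

    mappings: iterable of (pair_name, contig, position), one entry per mapped read.
    Returns {contig: supporting_pair_count} for contigs with >= min_pairs pairs.
    """
    by_pair = defaultdict(list)
    for pair_name, contig, position in mappings:
        by_pair[pair_name].append((contig, position))

    length = contig_lengths[target]
    counts = Counter()
    for reads in by_pair.values():
        if len(reads) != 2:
            continue  # skip pairs in which only one read was mapped
        # Consider each read of the pair in turn as the one on the target contig.
        for (c1, p1), (c2, _) in (reads, reads[::-1]):
            near_end = p1 <= window or p1 > length - window
            if c1 == target and near_end and c2 != target:
                counts[c2] += 1
    return {contig: n for contig, n in counts.items() if n >= min_pairs}


# Toy data with hypothetical contig names.
lengths = {"ice_contig": 10000, "flank_contig": 5000, "far_contig": 3000}
reads = [
    ("r1", "ice_contig", 9900), ("r1", "flank_contig", 50),
    ("r2", "ice_contig", 9800), ("r2", "flank_contig", 120),
    ("r3", "ice_contig", 5000), ("r3", "far_contig", 10),   # not near an end
    ("r4", "ice_contig", 100), ("r4", "far_contig", 2900),  # single supporting pair
    ("r5", "ice_contig", 9950),                             # mate unmapped
]
print(linked_contigs(reads, lengths, "ice_contig"))
```

Requiring more than one supporting pair is a design choice intended to reduce spurious links from chimeric or mis-mapped reads; any candidate adjacency would still need confirmation, as was done here by PCR.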

Association of the original ICE assembly contigs, additional putative ICE contigs and/or the putative genomic flanking region contigs was confirmed by standard PCR amplification in 25 μl reaction volumes followed by agarose gel electrophoresis (primers available upon request). Conventional Sanger sequencing was used to close sequence gaps within putative ICEs. PCR products to be used for Sanger sequencing were precipitated with 60 μl of 20% PEG (polyethylene glycol) / 2.5 M NaCl and washed with 70% ethanol. Sequencing was completed as described previously [27].

Prediction of genes and Tn916 sequence comparison

Putative genes were predicted using Prodigal [48]. Where possible, putative functions were assigned by BLAST match to sequences deposited in Genbank. Tn916 sequence regions were aligned by MUSCLE [43] and imported to MEGA5 [44] for visual inspection / sequence identity calculation.
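Differences of the kind reported for the Tn916-like sequences (substitutions plus insertion or deletion events) can be tallied from a pairwise alignment. In this study the aligned sequences were inspected in MEGA5; the following sketch is an assumed, simplified equivalent in which each run of consecutive gap columns is counted as a single indel event. The function name and the toy sequences are hypothetical.

```python
def alignment_differences(ref, query):
    """Count substitutions and indel events between two aligned sequences.

    A gap in `ref` is reported as an insertion in `query`; a gap in `query`
    is reported as a deletion. Each run of gap columns is one event.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = 0
    indels = []  # (kind, length) for each run of gap columns
    run_kind, run_len = None, 0
    for r, q in zip(ref.upper(), query.upper()):
        if r == "-" and q == "-":
            continue  # column with gaps in both sequences carries no information
        kind = "insertion" if r == "-" else "deletion" if q == "-" else None
        if kind != run_kind and run_kind is not None:
            indels.append((run_kind, run_len))  # close the previous gap run
            run_len = 0
        run_kind = kind
        if kind is None:
            substitutions += r != q
        else:
            run_len += 1
    if run_kind is not None:
        indels.append((run_kind, run_len))
    return {"substitutions": substitutions, "indels": indels}


# Toy aligned sequences (made up): one substitution, a 2 bp insertion, a 1 bp deletion.
print(alignment_differences("ACGT--ACGTACGT", "ACGTTTACCTAC-T"))
```

Applied per gene rather than to the whole element, the same tally would show whether differences are concentrated in a single locus, as observed here for tet(M).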