Introduction

The genus Achromobacter (β-Proteobacteria) was first established in 1981 when it was separated from Alcaligenes (Brenner et al. 2005). Members of the genus are distributed in soil, water, and marine environments. Some isolates are also common inhabitants of the human intestinal tract and can act as opportunistic pathogens in immunocompromised patients, causing bacteremia, meningitis, and urinary tract infection (Behrens-Muller et al. 2012; Gomez-Cerezo et al. 2003; Mandell et al. 1987). Phylogenetically, Achromobacter is closely related to the genera Alcaligenes and Bordetella, most members of which are human or animal pathogens (Gross et al. 2008; Mattoo and Cherry 2005).

Much attention has been paid to Achromobacter because it is an emerging nosocomial pathogen that is increasingly isolated from patients with cystic fibrosis (CF) (Amoureux et al. 2012; Pereira et al. 2011; Ronne Hansen et al. 2006; Spilker et al. 2012a). CF is an autosomal recessive genetic disease that leads to chronic infection of the respiratory tract and ultimately to progressive respiratory deficiency. Many bacteria are associated with respiratory tract infection in CF patients, such as Staphylococcus aureus, Haemophilus influenzae, and the opportunistic pathogen Pseudomonas aeruginosa (Lipuma 2010). To date, although there has been no definite evidence that Achromobacter strains cause infection, clinical studies have frequently reported infection in CF patients carrying isolates of this genus, suggesting a significant relationship between Achromobacter spp. and CF infection. Achromobacter xylosoxidans is the species most often reported among CF clinical isolates of Achromobacter. Recently, Achromobacter ruhlandii and several other putative novel Achromobacter species were also found in CF patients using multilocus sequence typing (MLST) (Ridderberg et al. 2012; Spilker et al. 2012b). In addition, MLST markers did not clearly separate the population structure of CF-source isolates from that of non-CF-source isolates (Spilker et al. 2012b). Achromobacter strains exhibit intrinsic multidrug resistance, which is a crucial challenge in treating infection with this potential pathogen (Amoureux et al. 2012). According to recent studies, a resistance–nodulation–cell division-type efflux system and several β-lactamases appear to be responsible for this resistance (Bador et al. 2013, 2011; El Salabi et al. 2012; Yamamoto et al. 2012).

In addition to the clinical isolates, numerous strains of this genus have been isolated from various environments. Achromobacter strains appear to display a wide diversity of metabolism, and some members have shown potential applications in biotechnological processes. Examples include aromatic compound degradation (A. xylosoxidans A8) (Jencova et al. 2008; Strnad et al. 2011), arsenic resistance and transformation (Achromobacter arsenitoxydans SY8; Achromobacter piechaudii HLE) (Cai et al. 2009; Li et al. 2012; Osborne and Ehrlich 1976), and production of enoate reductase for asymmetric bioreduction of activated alkenes (Achromobacter sp. JA81) (Liu et al. 2012).

Very little is known regarding the genetic characteristics of Achromobacter and the potential mechanisms causing disease. In recent years, six Achromobacter genomes have been sequenced, including both CF-source and environment-source isolates (listed in Table 1), which represented good candidates to explore unknown virulence factors associated with pathogenicity in Achromobacter. In the present study, a comparative genome analysis was performed for the first time to investigate issues associated with the genetic diversity, phylogenetics and evolution, and the genomic differences between CF-clinical and environmental bacteria of Achromobacter.

Table 1 General features of the six Achromobacter genomes

Materials and methods

Genome sequence and re-annotation

Six genome sequences (Table 1) were collected from the NCBI database. The contigs of each draft genome were reordered with the MAUVE v2.3 reorder tool (Darling et al. 2004), using A. xylosoxidans A8 as the reference genome, and then joined to form a pseudo-chromosome. Finally, all genomes were submitted to RAST (Aziz et al. 2008) for high-quality re-annotation to eliminate bias arising from different annotation methods; the Glimmer algorithm was selected for gene calling.

Carbon source utilization test

Glucose utilization was determined using the API 20 NE kit (bioMerieux, France). In addition, d-galactonate (127 mM) or d-mannose 6-phosphate (96 mM) was added separately to the minimal medium included in the API 20 NE kit to observe the growth of A. arsenitoxydans SY8.

Whole-genome alignment

ProgressiveMAUVE from MAUVE v2.3 (Darling et al. 2010) was used to perform a multiple genome comparison of the six Achromobacter genomes with default parameters. Locally collinear blocks (LCBs) (minimum threshold size = 500 bp) of each genome were extracted from the whole-genome alignment by stripSubsetLCBs and concatenated using a custom Perl script. The six concatenated core genomes (on average, 3.37 Mb each) were imported into DnaSP v5 (Librado and Rozas 2009), and nucleotide diversity (π) was calculated.
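As an illustration of the quantity computed here, nucleotide diversity (π) can be read as the average proportion of sites that differ between two aligned sequences, taken over all pairs. The following Python sketch is not the DnaSP implementation; the function name, the toy sequences, and the per-pair handling of gaps are assumptions made for illustration, and DnaSP may treat gapped sites differently.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Mean pairwise proportion of differing sites for aligned sequences.

    Assumption: for each pair, columns where either sequence holds a gap
    or an ambiguous base are skipped (DnaSP may handle such sites differently).
    """
    valid = set("ACGT")
    per_pair = []
    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in valid and y in valid:
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            per_pair.append(diffs / compared)
    if not per_pair:
        raise ValueError("no comparable sites in the alignment")
    return sum(per_pair) / len(per_pair)

# Toy alignment (hypothetical data): three sequences of ten sites.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.3f}, approximate identity = {1 - pi:.1%}")
```

Under this reading, 1 − π approximates the average pairwise nucleotide identity across the aligned region.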

Ortholog clustering analysis

To assess genome diversity among Achromobacter members at the gene level, the OrthoMCL package (Li et al. 2003) was used to determine the core and pan genome sizes. All CDSs of the six genomes were merged and searched against themselves using the blastp algorithm with an E-value cutoff of 1e-5 and coverage ≥ 50 %. All homologous proteins were then parsed from the blast results with a series of in-house Perl scripts and grouped into orthologous families with the clustering tool MCL (Markov Cluster Algorithm), using an inflation value of 1.5 (Enright et al. 2002).
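The hit-filtering step can be sketched in Python as follows. This is not the in-house Perl code; it assumes BLAST tabular output with the standard 12 columns (`-outfmt 6`), a hypothetical `lengths` dictionary of query protein lengths, and coverage defined as the aligned fraction of the query, none of which is specified in the text. Dropping self-hits is also an assumption.

```python
def filter_hits(lines, lengths, max_evalue=1e-5, min_coverage=0.5):
    """Keep blastp hits that pass the E-value and query-coverage cutoffs.

    lines   : iterable of tab-separated rows in standard 12-column format
              (qseqid sseqid pident length mismatch gapopen qstart qend
               sstart send evalue bitscore)
    lengths : dict mapping query id -> protein length (hypothetical input)
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        qstart, qend = int(fields[6]), int(fields[7])
        evalue = float(fields[10])
        coverage = (abs(qend - qstart) + 1) / lengths[query]
        # Assumption: self-hits are uninformative and are dropped.
        if query != subject and evalue <= max_evalue and coverage >= min_coverage:
            kept.append((query, subject, evalue, coverage))
    return kept

# Toy rows (hypothetical data).
example_lines = [
    "geneA\tgeneA\t100.0\t100\t0\t0\t1\t100\t1\t100\t1e-60\t200",
    "geneA\tgeneB\t80.0\t90\t18\t0\t1\t90\t5\t94\t1e-30\t150",
    "geneA\tgeneC\t40.0\t30\t18\t0\t1\t30\t1\t30\t1e-8\t50",
    "geneB\tgeneD\t35.0\t80\t52\t0\t1\t80\t1\t80\t0.01\t30",
]
lengths = {"geneA": 100, "geneB": 100}
hits = filter_hits(example_lines, lengths)
```

The retained pairs would then be passed to a clustering step such as MCL; that step is not reproduced here.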

Phylogenomic analyses

Because of the close phylogenetic relationship among Achromobacter, Alcaligenes, and Bordetella, a total of 18 genomes (six Achromobacter spp., two Alcaligenes spp., and ten Bordetella spp.) were included in the phylogenetic analysis. Based on blastp analysis (E-value < 1e-5; identity > 50 %; coverage > 90 %), the 18 genomes contained 436 conserved genes that had exactly one member per genome and were nearly identical in length. We used ClustalW (Thompson et al. 1994) to align the protein sequences of these genes and concatenated the individual gene alignments into a single string of amino acids for each genome. The concatenated alignment was used to infer phylogenies using MEGA 5.05 (Tamura et al. 2011) with the neighbor-joining (NJ) algorithm or PhyML 3.0 with the maximum-likelihood (ML) algorithm. Tree reliability was estimated by bootstrapping with 1,000 replicates.
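The concatenation of per-gene alignments into one string per genome can be sketched as below. This is a minimal Python illustration under stated assumptions: each gene alignment is represented as a dictionary from genome name to aligned sequence, and the names `concatenate_alignments`, `genes`, and the toy sequences are hypothetical rather than taken from the study.

```python
def concatenate_alignments(gene_alignments, genomes):
    """Join per-gene alignments into one long aligned string per genome.

    gene_alignments : list of dicts, genome name -> aligned protein sequence
    genomes         : list of genome names expected in every alignment
    """
    parts = {g: [] for g in genomes}
    for index, alignment in enumerate(gene_alignments):
        # Single-copy requirement: exactly one sequence per genome.
        if set(alignment) != set(genomes):
            raise ValueError(f"gene {index} is not present exactly once per genome")
        # All rows of one alignment must have the same aligned length.
        if len({len(seq) for seq in alignment.values()}) != 1:
            raise ValueError(f"gene {index} has rows of unequal aligned length")
        for g in genomes:
            parts[g].append(alignment[g])
    return {g: "".join(chunks) for g, chunks in parts.items()}

# Toy input (hypothetical data): two aligned genes in three genomes.
genomes = ["g1", "g2", "g3"]
genes = [
    {"g1": "MKV", "g2": "MKI", "g3": "MRV"},
    {"g1": "A-LT", "g2": "AGLT", "g3": "AGLS"},
]
supermatrix = concatenate_alignments(genes, genomes)
```

Keeping the genome order fixed ensures that every column of the concatenated alignment still compares homologous positions.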

ProgressiveMAUVE was also used to align the 18 genomes and create a phylogenetic guide tree.

Recombination analysis

ClonalFrame uses a Markov chain Monte Carlo algorithm to infer the microevolution of closely related bacteria (Didelot and Falush 2007). Here, it was used to infer the recombination rate among the six Achromobacter members, including both clinical and environmental isolates. The core-genome alignment (LCB cutoff = 2 kb) was extracted from the whole-genome alignment by stripSubsetLCBs and imported into ClonalFrame 1.2 for analysis. ClonalFrame was run for 200,000 iterations (including 100,000 burn-in iterations), with sampling every 100 iterations. The outputs of two parallel runs indicated good convergence.

Identification of virulence factors

The virulence factor database (VFDB) (Chen et al. 2005) was downloaded. All predicted genes of the six Achromobacter members were searched against the VFDB by blastp with loose criteria (E-value ≤ 1e-5; identity ≥ 35 %; coverage ≥ 70 %). Putative virulence genes were then identified by combining these hits with the results of the genomic comparison between the clinical and environmental isolates.

Results

Genome features

To date, six Achromobacter genome sequences are available, comprising one complete and five draft genomes (Table 1). Based on this information, Achromobacter has a relatively large genome (6–7 Mb) with a high GC content (65–68 %), which may reflect its intrinsic resistance to hazardous conditions. A larger genome (>5 Mb) usually holds more genetic determinants, which indicates versatile metabolism and strong adaptation to variable niches (Ochman and Davalos 2006). With a high GC content, the genome tends to be quite stable and able to tolerate DNA damage under adverse conditions. No reliable CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) element was detected in the six Achromobacter genomes using either the de novo identification tool CRT or the similarity-search tool CRISPRAlign (Rho et al. 2012). These genomes contained small numbers of insertion sequence (IS) elements relative to their large genome sizes (Table 1). The number of IS elements varied considerably among the genomes; however, no obvious difference in number between clinical and environmental isolates was observed. The genomes possessed many phage regions (we use this term instead of prophage because some prophages were split across contigs), which might be explained by the absence of the CRISPR immune system (Barrangou et al. 2007). Numerous phage elements may contribute to genome variation in the evolution of Achromobacter spp.

Whole-genome comparisons

We performed a multiple genome comparison of the six Achromobacter spp. using progressiveMauve in MAUVE v2.3 (with A. xylosoxidans A8 as the reference genome) and obtained 1,948 locally collinear blocks (LCBs; cutoff length = 500 bp) shared among the six genomes. The largest LCB was only 18.5 kb in length. Such a small size may be due to genome rearrangement events and the fragmented state of the draft genomes, since the genome alignment showed evidence of numerous rearrangements and poor synteny (Fig. S1 in the Supplementary Material). The aligned “core” region made up an average of 52 % of each Achromobacter genome. For the concatenated core region, the six genomes had a nucleotide diversity (π) of 0.13, which corresponds to an approximate genus-wide nucleotide sequence identity of 87 %.

Core and pan genomes are commonly used to evaluate genome diversity within a species or among closely related bacteria. The core gene set is the set of genes shared by all strains, which comprises the genetic determinants that maintain the properties of the group. The pan genome is the total gene pool, reflecting the overall capacity for genetic determinants. To eliminate bias from different annotation methods, the six genomes were first submitted to RAST for high-quality re-annotation, and the re-annotated genomes were then analyzed with OrthoMCL for ortholog clustering. The OrthoMCL results indicated that the six genomes had a core gene set of 3,398 orthologous families and a pan genome of 10,750 orthologous families. The core constituted on average 56.4 % of each Achromobacter genome, slightly more than the value inferred from the core region at the nucleotide level. The pan genome was 1.8 times the average size of the six genomes and made up 29.7 % of their total gene number. The large gene pool of these Achromobacter members implies an open pan genome structure and, to a certain extent, endows them with the capability to adapt to their surrounding environments or hosts.
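The definitions of core and pan genome used here can be made concrete with a small Python sketch. It assumes the ortholog clustering has already been reduced to a mapping from family identifier to the set of genomes that contain at least one member; the function name, family identifiers, and toy data are hypothetical and do not reproduce the OrthoMCL output format.

```python
def core_and_pan(families, n_genomes):
    """Count core and pan genome sizes from ortholog family membership.

    families  : dict mapping family id -> set of genome names with a member
    n_genomes : total number of genomes compared
    Returns (core_size, pan_size), both counted in orthologous families.
    """
    pan_size = len(families)
    core_size = sum(1 for members in families.values() if len(members) == n_genomes)
    return core_size, pan_size

# Toy data (hypothetical): four families across three genomes.
toy_families = {
    "fam1": {"A8", "AXX-A", "HLE"},   # shared by all genomes: core
    "fam2": {"A8", "AXX-A", "HLE"},   # shared by all genomes: core
    "fam3": {"A8", "AXX-A"},          # accessory
    "fam4": {"HLE"},                  # strain-specific
}
core, pan = core_and_pan(toy_families, n_genomes=3)
print(f"core = {core} families, pan = {pan} families")
```

The same membership table also yields strain-specific families (those whose set contains a single genome), which is the kind of count summarized in a Venn diagram.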

Achromobacter has an overall similar metabolism architecture because an analogous proportion of cluster of orthologous groups families (Tatusov et al. 2000) is distributed among the six strains, except that A. xylosoxidans A8 has a remarkably high proportion of genes associated with lipid transport and metabolism, as well as secondary metabolite biosynthesis (Fig. S2 in the Supplementary material).

Phylogenomic and evolutionary analyses

We first performed a 16S rRNA gene analysis, which revealed a close phylogenetic relationship among the genera Achromobacter, Alcaligenes, and Bordetella (Fig. 1a). Then, the six Achromobacter members plus 12 genomes from the genera Alcaligenes and Bordetella were included in a phylogenomic analysis based on conserved proteins. Using a bidirectional blastp approach, we detected 436 conserved proteins across the 18 genomes that had exactly one member per genome and were nearly identical in length. This conserved protein set was aligned using ClustalW, and the individual alignments were concatenated into a string of 150,496 amino acids for each genome. The concatenated alignment was edited to remove ambiguous amino acids and was subsequently used to build an NJ tree with MEGA 5.05 (Fig. 1b) and an ML tree with PhyML 3.0 (Fig. S3 in the Supplementary material). The topology of the NJ tree was nearly identical to that of the ML tree and the Mauve guide tree (Fig. 1c). The NJ tree displayed a better-resolved topology than the tree based on 16S rRNA genes (Fig. 1a), which can be explained by the limited resolution of the 16S rRNA gene for phylogenetic inference. As shown in Fig. 1b, the six Achromobacter members grouped together and were clearly distant from the Alcaligenes members. More importantly, the entire group of Achromobacter spp. clustered among the Bordetella strains and showed the closest relationship to Bordetella petrii DSM12804, the only non-pathogenic species in the Bordetella genus (Gross et al. 2008).

Fig. 1

Phylogenetic analyses among three related genera Achromobacter, Alcaligenes, and Bordetella members. The NJ algorithm tree constructed based on 16S rRNA genes of 18 strains from Achromobacter, Alcaligenes, and Bordetella by the MEGA 5.05 (a); the NJ tree constructed based on 436 conserved proteins shared among the 18 strains by the MEGA 5.05 (b); the Mauve guide tree of the 18 strains based on whole-genomic similarity at the nucleotide level through multiple genome comparison tool Mauve (c); and the maximum-likelihood (ML) tree constructed based on concatenated sequences of 11 conserved proteins of the type III secretion system by phyML in 11 strains from Achromobacter and Bordetella (d). The accession numbers of the sequences used in the phylogenetic reconstruction are listed after each strain

To verify the phylogenomic result based on the 436 conserved proteins, average nucleotide identity (ANI) was used to infer the phylogenetic relationship among the genera Achromobacter, Alcaligenes, and Bordetella. ANI analysis is considered a robust method for comparing genetic relatedness among strains (Konstantinidis and Tiedje 2005; Richter and Rossello-Mora 2009). The results indicated that the mean ANI values between B. petrii DSM12804 (or Bordetella avium 197 N) and Achromobacter members were quite similar to, and even greater than, the mean ANI values between B. petrii DSM12804 (or B. avium 197 N) and some other Bordetella species (Fig. 2). The ANI results strongly supported the phylogenomic finding that Achromobacter strains are more closely related to B. petrii DSM12804 and B. avium 197 N than some other Bordetella strains are. This may suggest a relatively recent divergence of the two genera from a common ancestor, even though whole-genome similarity differed significantly between them (Fig. 3).

Fig. 2

Evolutionary distance evaluation for the Achromobacter, Alcaligenes, and Bordetella genera based on ANI analysis. Blue squares represent ANI values between a pair of genomes within Achromobacter. Red triangles represent ANI values between a pair of genomes within Bordetella (five represented strains, as shown in Fig. 4). Green circles and purple diamonds indicate ANI values between B. avium 197 N or B. petrii DSM12804 and the other strains, which belong to Achromobacter and Alcaligenes, respectively

Fig. 3

General genomic comparison of the potential CF pathogen (A. xylosoxidans AXX-A) to the other five strains of Achromobacter and five strains of Bordetella based on protein sequence similarities. From outside to inside, ring 1 (deep blue) and 2 (light blue) show ORF encoded from forward/reverse strand; meanwhile tRNA is marked by red color; Ring 3, the boundary between two contigs (red line); Ring 4 to 13 represent genomes ordered by greatest overall similarity to A. xylosoxidans AXX-A: A. xylosoxidans C54, A. piechaudii HLE, A. xylosoxidans A8, A. arsenitoxydans SY8, A. piechaudii ATCC 43553, B. petrii DSM12804, B. bronchiseptica RB50, B. parapertussis 12822, B. avium 197 N, and B. pertussis Tohama I. The color bar as shown at right top indicates the corresponding protein identity; the inside two rings denote G + C content and G−C/G + C skew with 10 kb windows, respectively. The outside red bars denote the unique regions to A. xylosoxidans AXX-A, compared with two environmental isolates A. xylosoxidans A8 and A. piechaudii HLE (numbered from 1 to 26)

ClonalFrame was used to infer microevolution for the six Achromobacter genomes. This tool uses a Bayesian statistical model to infer genealogy and to estimate the relative impact of recombination (r/m) among closely related bacteria. The parameter r/m is the ratio of the rates at which nucleotides are substituted by recombination and by point mutation, and thus measures the actual impact of homologous recombination on genome evolution. We carried out a ClonalFrame analysis based on the core region of the multiple genome alignment (on average 3.37 Mb per genome). The results indicated a fairly low level of recombination within the Achromobacter genus: r/m = 0.51 with a 95 % credibility interval of [0.42, 0.59], a small value compared with that of many other bacteria (Vos and Didelot 2009). In other words, recombination introduced only about 0.51 nucleotide substitutions for every substitution introduced by point mutation during the evolution of the Achromobacter genomes.
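To make the meaning of r/m concrete, the ratio can be converted into the share of all nucleotide substitutions attributable to recombination, r / (r + m) = (r/m) / (1 + r/m). The Python sketch below is simple arithmetic on the reported values, not part of ClonalFrame; the function name is hypothetical.

```python
def recombination_share(r_over_m):
    """Fraction of substitutions introduced by recombination, given r/m.

    If recombination introduces r substitutions for every m introduced by
    point mutation, its share of the total is r / (r + m).
    """
    if r_over_m < 0:
        raise ValueError("r/m must be non-negative")
    return r_over_m / (1.0 + r_over_m)

# Reported estimate and 95 % credibility interval for Achromobacter.
estimate = recombination_share(0.51)
lower = recombination_share(0.42)
upper = recombination_share(0.59)
print(f"share from recombination: {estimate:.2f} ({lower:.2f} to {upper:.2f})")
```

With r/m = 0.51, roughly one third of the substitutions in the core alignment would be attributed to recombination and the remainder to point mutation.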

Basic nutritional metabolism

All six Achromobacter members were predicted to have an intact TCA cycle and a nearly complete glycolysis pathway. The pathway is incomplete because the strains could not convert glucose into d-glucose 6-phosphate or the intermediate d-glucose 1-phosphate in the initial steps: none of the relevant enzymes (such as hexokinase, glucose-6-phosphatase, polyphosphate glucokinase, and ADP-dependent glucokinase) were found in their genomes. Although A. xylosoxidans A8 could transform glucose into gluconic acid, the absence of downstream enzymes may prevent further utilization of the gluconic acid. This result is consistent with experimental data showing that A. arsenitoxydans SY8 (Table S1 in the Supplementary Material) and A. piechaudii (Vandamme et al. 2013) could not assimilate glucose. However, our observation is inconsistent with a previous report that A. xylosoxidans could grow using glucose as the sole carbon source (Brenner et al. 2005). A. xylosoxidans has been considered to utilize d-xylose via the xylAB pathway (Gu et al. 2010), but we could identify only xylB in the genomes. The Achromobacter strains may therefore contain other, unknown pathways, or xylB alone may be sufficient for xylose degradation. From the genomic information, we predicted that Achromobacter strains are capable of using galactonate and d-mannose 6-phosphate as sole carbon sources for growth. The carbon source utilization tests of A. arsenitoxydans SY8 supported this prediction (Table S1 in the Supplementary Material). In addition, d-ribose may be used only by A. xylosoxidans AXX-A.

A great potential for biodegradation of aromatic compounds in Achromobacter

Microbes play a crucial role in aromatic compound degradation. Aromatic compound degradation usually includes two main steps: (1) peripheral reactions, which transform a large number of aromatic derivatives into a limited number of central intermediates and are catalyzed mainly by oxygenases; and (2) ring-cleavage reactions, in which the central intermediates are cleaved by dioxygenases. The resulting products are fed into the TCA cycle. Achromobacter genomes contained numerous genes involved in aromatic compound metabolism. To systematically investigate their aromatic catabolic functions, we built a database of key oxygenase markers of aromatic catabolism, including all 48 oxygenase markers that cover all aromatic peripheral reactions and ring-cleavage pathways (Perez-Pantoja et al. 2012). We searched all the genes of the six Achromobacter genomes against this self-built database and listed the annotation results in Table S2 in the Supplementary material. The results indicated that A. xylosoxidans A8 possessed more oxygenase markers (26) than any of the other five strains, nearly twice as many as A. arsenitoxydans SY8 (14) (Table S2 in the Supplementary material). Such a robust capacity for aromatic compound degradation in A. xylosoxidans A8 may have been acquired through recent horizontal gene transfer (HGT) events driven by habitat adaptation. However, 13 key oxygenase markers were shared across all six Achromobacter strains, underlining the intrinsic potential for aromatic catabolism in the Achromobacter genus (Table S2 in the Supplementary material). Taken together, a total of 31 oxygenase markers were identified in the Achromobacter genomes, covering nearly two-thirds of the key enzymes of aromatic catabolism and revealing the potential of the genus for degrading aromatic compounds.

Innate multidrug resistance in Achromobacter

Bacterial multidrug resistance (MDR) is a major challenge in the clinical treatment of bacterial infection. Numerous studies have shown that HGT, which spreads antibiotic resistance genes among bacteria, is responsible for MDR. In addition to such acquired resistance, some bacteria such as P. aeruginosa exhibit intrinsic resistance to antimicrobials. In this study, we found that all six Achromobacter members encoded an abundance of pumps associated with MDR (ranging from 40 in A. xylosoxidans C54 to 53 in A. xylosoxidans AXX-A). Larger genomes have been reported to possess greater numbers of MDR efflux pumps (Piddock 2006). Compared with Bordetella and other genera, the unusually high number of pumps relative to genome size in Achromobacter strains suggests a close association between MDR efflux pumps and intrinsic resistance. For instance, recent experiments verified that the MDR efflux systems AxyABM and AxyXY-OprZ in A. xylosoxidans AXX-A are responsible for resistance to multiple antibiotics (Bador et al. 2013, 2011). In addition to contributing to antimicrobial resistance, MDR efflux pumps have also been reported to play an important role in virulence (Piddock 2006).

Unique genomic features discriminating the CF-source isolate from the environment-source isolates

We selected three strains to investigate the genomic divergence between a potential CF pathogen (represented by A. xylosoxidans AXX-A) and environmental isolates (represented by A. xylosoxidans A8 and A. piechaudii HLE). These three strains were chosen, rather than all six, for the following reasons: (1) among the three clinical isolates, A. piechaudii ATCC 43553 was isolated from a wounded nose and is regarded as a non-pathogen (Table 1). Both A. xylosoxidans AXX-A and C54 were CF-source isolates and were highly similar at the genome scale. Compared with strain C54, strain AXX-A had more complete genomic information and had been investigated more often in clinical studies (Amoureux et al. 2012; Bador et al. 2013, 2011), so we chose only strain AXX-A for further analysis; and (2) among the three environmental isolates, A. arsenitoxydans SY8 was not a purely environmental sample because its niche was linked to an animal habitat (Cai et al. 2009; Li et al. 2012).

Compared with A. xylosoxidans A8 and A. piechaudii HLE, A. xylosoxidans AXX-A had 1,068 unique genes, which included 26 specific regions (Fig. 4). These strain-specific genes and regions are listed in Table S3 in the Supplementary material; the regions are numbered, their positions are shown in Fig. 3, and they are summarized in Table 2. Overall, of the 1,068 strain-specific genes, 40 % were annotated as hypothetical proteins and nearly 21 % encoded enzymes, which may indicate this strain’s metabolic robustness in adapting to CF infection-related habitats. Analysis of the A. xylosoxidans AXX-A unique genes indicated that many of them were putatively responsible for its pathogenicity, mainly including a type III secretion system (T3SS), a ‘polysaccharide island’, lipopolysaccharide (LPS) O-antigen, and several cellular toxins (Table 2 and Table S3 in the Supplementary material; see next section for detailed explanation).

Fig. 4

Genome comparison among the potential CF pathogen (A. xylosoxidans AXX-A) and the two environmental bacteria (A. xylosoxidans A8 and A. piechaudii HLE). Venn diagram illustrates the number of orthologous families unique or shared among the three genomes. The proteins were grouped by OrthoMCL software with an inflation value of 1.5. The numbers in brackets show the predicted proteomic sizes of the corresponding genome

Table 2 The unique regions to the CF clinical isolate A. xylosoxidans AXX-A, compared with the two environmental isolates A. xylosoxidans A8 and A. piechaudii HLE

In contrast, the two environmental isolates (A. xylosoxidans A8 and A. piechaudii HLE) lacked definite strain-specific virulence factors but possessed unique metabolic features that enable them to survive in host-free environments. A. xylosoxidans A8 contained one chromosome and two plasmids, with a total of 1,639 unique genes (Table S4 in the Supplementary material and Fig. 4). Twenty-six strain-specific (di)oxygenases were found to be associated with aromatic compound degradation (Table S4 in the Supplementary material). Likewise, A. xylosoxidans A8 carried entire gene clusters for degradation of chlorobenzoate (mocpRABCD) and salicylate (hybRABCD) (Jencova et al. 2008; Strnad et al. 2011). In addition, several unique features (such as pyoverdine synthesis and heavy metal resistance) may also help A. xylosoxidans A8 adapt to niches contaminated with aromatic compounds. As for A. piechaudii HLE, its genome contained a high proportion of mobile genetic elements, represented by two phage regions and several genomic islands that collectively covered over 301 kb (nearly 5 % of the genome) and accounted for nearly half of the unique genes (421/938 genes) (Table S5 in the Supplementary material). One large phage region, 186 kb in length, spanned multiple contigs and carried several genes encoding components of type IV secretion systems (T4SS) and putative RTX toxins (the contigs were reordered using the complete reference genome of A. xylosoxidans A8). Its genome contained an “arsenic island” (20 genes, Table S5 in the Supplementary material) (Trimble et al. 2012), accounting for arsenic resistance and transformation. This arsenic island was also present in A. arsenitoxydans SY8 (Li et al. 2012) but absent from the other four strains. In A. arsenitoxydans SY8, this island was predicted to have been acquired via a recent HGT event, since its flanking sequences were nearly identical to those of A. xylosoxidans A8. A. piechaudii HLE might also have acquired the arsenic island through HGT, as a relatively high degree of identity (78 % at nucleotide level) is shared between these two islands with nearly identical gene organization. In addition, A. piechaudii HLE had a PFGI1-like cluster (mobile genomic island), 93 kb in size, which included plasmid-related elements and metal resistance-related genes (Co, Cd, Zn, Hg, and As). Overall, A. piechaudii HLE harbored many strain-specific genes for adaptation to metal(loid)-related environments.

Unique genes related to pathogenicity in A. xylosoxidans AXX-A

As mentioned above, numerous strain-specific genes of A. xylosoxidans AXX-A were potentially linked to its pathogenicity.

Region 1 was composed of 33 genes encoding an entire type III secretion system (T3SS). A similar T3SS was also found in A. xylosoxidans C54 and A. arsenitoxydans SY8 (Li et al. 2012) but was missing from the other three strains (A. xylosoxidans A8 and A. piechaudii strains HLE and ATCC 43553) (Fig. 3). Numerous studies have demonstrated that the T3SS is a pivotal virulence factor for pathogens (such as Bordetella, Yersinia, and Salmonella) and is found only in Gram-negative pathogens and symbiotic bacteria (Coburn et al. 2007). Consequently, the T3SS is likely to contribute to the pathogenicity of A. xylosoxidans AXX-A. Region 23 contained a gene set encoding the LPS O-antigen (22.8 kb). The LPS O-antigen is the main endotoxin of Gram-negative pathogens and triggers an immune response in the host. Besides Region 23, two genes of Region 6 (AXXA_22195 and AXXA_22200) were also related to O-antigen formation. Region 24 was linked to capsule and cellulose biosynthesis and formed a “polysaccharide island” (36.0 kb), which was missing from the other nine strains but present in B. avium 197 N (Fig. 3). Both capsule and cellulose are components of the cell wall. The capsule is considered a virulence factor that enhances the ability of bacteria to cause disease by protecting them from engulfment by host immune cells such as macrophages (Smith et al. 1999). Cellulose is produced by only a small proportion of bacteria and may be related to biofilm formation and bacterial infection (Romling 2002). This polysaccharide island therefore potentially contributes to the virulence of A. xylosoxidans AXX-A (Wiley et al. 2012). Region 13 included a gene involved in the activation of hemolysin, a toxin that damages the membrane of host red blood cells. Additionally, five unique regions in the A. xylosoxidans AXX-A genome encoded prophage-related products (Regions 3, 11, 12, 17, and 18). Whether these regions are related to its pathogenicity is unknown.

Four unique proteins were putatively involved in the degradation of mRNA and protein, including one endoribonuclease L-PSP (AXXA_06053) and three proteases (AXXA_02143, AXXA_05413, and AXXA_23570). These proteins may serve as cellular toxins that attack the host cell. One gene encoding diguanylate cyclase (AXXA_30537) was found in the A. xylosoxidans AXX-A genome. This enzyme catalyzes the formation of the second messenger cyclic-di-GMP, which has been reported to be important for the survival of intracellular pathogens within their hosts (Newell et al. 2011). Likewise, approximately 20 unique genes encoding functions related to iron transport and utilization (including Region 7) were detected in the A. xylosoxidans AXX-A genome (Table S3).

The non-unique genes related to pathogenicity in A. xylosoxidans AXX-A

In addition to potential virulence factors from the strain-specific gene section of A. xylosoxidans AXX-A, we further analyzed virulence-related genes from its non-unique gene section (the total genes of A. xylosoxidans AXX-A subtracting the 1,068 strain-specific genes).

During lung infection in individuals with CF, pathogens must thrive in the altered mucus of the lungs through successful adherence, resistance to the host immune system, and colonization (Saiman 2004). In the A. xylosoxidans AXX-A genome, we identified five O-antigen genes (AXXA_01150, AXXA_01195, AXXA_01200, and AXXA_09588) and an accessory colonization factor AcfC (AXXA_16582). A. xylosoxidans AXX-A had 28 genes involved in flagella biosynthesis. The flagella of pathogens play a role in biofilm formation and other pathogenic adaptations (such as motility and macrophage invasion). The Vi capsular polysaccharide is an important factor during infection because it helps the pathogen resist host immunity (Hirose et al. 1997). We found five genes (AXXA_06708, AXXA_06713, AXXA_24150, AXXA_24160, and AXXA_24190) that were implicated in Vi capsular polysaccharide synthesis. Furthermore, three superoxide dismutases (AXXA_06823, AXXA_27300, and AXXA_30127) and a redox-sensitive transcriptional activator SoxR (AXXA_24890) participating in oxidative stress tolerance were identified. A. xylosoxidans AXX-A contained a set of nine genes (AXXA_25765-AXXA_25800) participating in the biosynthesis of alcaligin, a type of siderophore. Alcaligin is a high-affinity iron-chelating agent secreted by bacteria to facilitate iron utilization in iron-limited environments (e.g., the human body). The alcaligin biosynthesis cluster, together with the above-mentioned unique genes related to iron transport and utilization, suggests that A. xylosoxidans AXX-A is able to overcome iron-limited conditions when colonizing the host (Moore et al. 1995). Phosphorus is also an essential limiting factor for the growth of pathogens during infection of the host. For example, in the plant pathogen Agrobacterium tumefaciens, phosphorus limitation promoted biofilm formation via the two-component signal transduction system phoRB (Danhorn et al. 2004). Genome analysis indicated that A. xylosoxidans AXX-A possessed the phoRB regulon (AXXA_13007 and AXXA_13012), one phosphate transport system regulator PhoU (AXXA_19502), and a high-affinity phosphate transport system, pstSCAB (AXXA_03142-AXXA_03157). These systems may be used to adjust phosphorus utilization under phosphorus-limited conditions.

Pathogens are often equipped with “weapons” (such as toxins or effector proteins) that enable invasion of the host cell. Secretion systems are essential components of pathogenic and symbiotic bacteria and transport secreted proteins (effectors) during bacterium/host interactions. Besides the unique T3SS, A. xylosoxidans AXX-A had five genes (AXXA_21910, AXXA_21975, AXXA_24775, AXXA_24800, and AXXA_24795) encoding components of a type II secretion system (T2SS), which in the CF pathogen P. aeruginosa secretes toxins and enzymes into the extracellular fluid (Filloux et al. 1998). Two genes encoding endotoxin lipooligosaccharide (LOS; AXXA_18121 and AXXA_18126) and four genes encoding phospholipase C (AXXA_05283, AXXA_17181, AXXA_19507, and AXXA_22000) were found in its genome. Phospholipase C is toxic to host cells because it degrades the phospholipid surfactant that reduces the surface tension of the alveoli (Songer 1997). Genes required for persistent infection, such as icl/aceA (AXXA_02462), were also identified.

Quorum sensing (QS) has been reported to participate widely in regulating pathogenicity in many pathogens (Hentzer et al. 2003; Rutherford and Bassler 2012), but we could not find any known QS genes in the Achromobacter genomes. This finding agrees with the observation that no QS-related genes have been found in the closely related genus Bordetella, whose pathogenic mechanisms have been well characterized (Mattoo and Cherry 2005; Mattoo et al. 2001).

The virulence factor T3SS is highly conserved in Achromobacter and Bordetella

Of the 16 strains of Achromobacter and Bordetella, the T3SS was found in all mammal/human pathogens but not in the bird pathogen B. avium 197N. Strain 197N may have lost the T3SS during its adaptation to a poultry host. We compared the T3SS in 11 strains and identified 11 conserved proteins. Phylogenetic analysis based on the 11 concatenated conserved T3SS proteins yielded a topology similar to that of the core genome-based phylogenomic tree (Fig. 1b, d). This finding indicates that the T3SS is highly conserved and was most probably vertically inherited from a common ancestor.
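The concatenation step mentioned above can be illustrated with a minimal Python sketch. It assumes that each conserved protein has already been aligned separately and that each alignment is stored as a mapping from strain name to aligned sequence. The function name `concatenate_alignments`, the strain labels, and the toy sequences are hypothetical and are not taken from the study, and the alignment and tree-building software actually used is not stated in this section.

```python
from typing import Dict, List


def concatenate_alignments(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-protein alignments into one concatenated sequence per strain.

    Only strains present in every alignment are kept, so every row of the
    resulting supermatrix has the same length.
    """
    if not alignments:
        return {}
    # Strains shared by all per-protein alignments.
    shared = set(alignments[0])
    for aln in alignments[1:]:
        shared &= set(aln)
    parts: Dict[str, List[str]] = {strain: [] for strain in sorted(shared)}
    for aln in alignments:
        # Within one alignment, all kept sequences must have equal length.
        lengths = {len(aln[strain]) for strain in shared}
        if len(lengths) > 1:
            raise ValueError("sequences within one alignment differ in length")
        for strain in parts:
            parts[strain].append(aln[strain])
    return {strain: "".join(chunks) for strain, chunks in parts.items()}


# Hypothetical toy input: two aligned proteins for three strains.
toy_alignments = [
    {"strainA": "MK-L", "strainB": "MKAL", "strainC": "MRAL"},
    {"strainA": "GTT", "strainB": "GSV", "strainC": "GSV"},
]
supermatrix = concatenate_alignments(toy_alignments)
```

The concatenated sequences would then be passed to a tree-building program, and the resulting topology compared with the core-genome tree.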

Discussion

From the extensive genomic survey of the six Achromobacter members, a core set of 3,398 orthologous families roughly reflects the genetic content that defines Achromobacter. Based on core and pan-genome analysis as well as nucleotide diversity (π) calculation, this genus displays considerable diversity at the genome scale. Ordinarily, a pair of genomes within a species has a mean ANI at or above the 95 % cut-off (Konstantinidis and Tiedje 2005; Richter and Rossello-Mora 2009). ANI values well below 95 % within both A. xylosoxidans and A. piechaudii (Fig. 2) suggest significant diversity within those species.
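As a rough illustration of the two measures discussed above, the following Python sketch computes nucleotide diversity (π) as the mean proportion of differing sites over all pairs of aligned sequences, and flags genome pairs whose ANI falls below the 95 % species cut-off. It assumes gap-free aligned sequences of equal length and precomputed ANI values. The function names and all toy values are hypothetical, and the sketch is not the pipeline used in the study.

```python
from itertools import combinations
from typing import Dict, List, Tuple

ANI_SPECIES_CUTOFF = 95.0  # percent; the conventional threshold cited in the text


def nucleotide_diversity(seqs: List[str]) -> float:
    """Average per-site pairwise difference (pi) over aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    differences = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return differences / (len(pairs) * length)


def pairs_below_cutoff(
    ani: Dict[Tuple[str, str], float], cutoff: float = ANI_SPECIES_CUTOFF
) -> List[Tuple[str, str]]:
    """Return genome pairs whose ANI is below the species cut-off."""
    return sorted(pair for pair, value in ani.items() if value < cutoff)


# Hypothetical toy data, not values from the study.
toy_alignment = ["AAAA", "AAAT", "AATT"]
pi = nucleotide_diversity(toy_alignment)

toy_ani = {("g1", "g2"): 97.2, ("g1", "g3"): 88.4, ("g2", "g3"): 89.0}
divergent_pairs = pairs_below_cutoff(toy_ani)
```

In this toy case the three sequence pairs differ at 1, 2, and 1 of 4 sites, so π is 4/12, and two of the three genome pairs fall below the cut-off.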

Recombination analysis revealed a relatively low rate of genetic exchange among Achromobacter strains, with a small mean r/m of 0.51 (Vos and Didelot 2009). However, it should be noted that the r/m value was most likely underestimated because these strains came from different ecological environments, which could create a geographical barrier to the close contact necessary for the exchange of genetic information (Vos and Didelot 2009). Homologous recombination is considered one of the major forces in bacterial evolution (Didelot and Maiden 2010); it imports heterologous material in vivo, improving the ability of bacteria to adapt to novel habitats and eventually facilitating speciation. Conversely, homologous recombination can also remove variable regions of the chromosome to maintain genome integrity (Shapiro et al. 2012). In the latter case, the low recombination rate among Achromobacter strains would have a positive effect on their genomic diversity.
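To make the interpretation of r/m concrete, the sketch below treats it as the ratio of nucleotide substitutions introduced by recombination to those introduced by point mutation. A value below 1 means that mutation contributed more substitutions than recombination. This section does not describe how the ratio was estimated, so the direct counting shown here is purely illustrative. The function names and the counts are hypothetical.

```python
def r_over_m(recombination_substitutions: int, mutation_substitutions: int) -> float:
    """Ratio of substitutions introduced by recombination to those by mutation."""
    if recombination_substitutions < 0:
        raise ValueError("recombination substitution count must be non-negative")
    if mutation_substitutions <= 0:
        raise ValueError("mutation substitution count must be positive")
    return recombination_substitutions / mutation_substitutions


def dominant_process(ratio: float) -> str:
    """Name the process that contributed more substitutions."""
    if ratio > 1:
        return "recombination"
    if ratio < 1:
        return "mutation"
    return "equal"


# Hypothetical counts chosen only for illustration.
toy_ratio = r_over_m(30, 60)
toy_label = dominant_process(toy_ratio)
```

Under this reading, the reported mean r/m of 0.51 corresponds to mutation being the larger contributor.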

Compared with the 16S rRNA-based phylogenetic method, our results indicate that phylogenomic analysis is a robust approach to distinguishing Achromobacter from Alcaligenes. In the past, some Achromobacter strains, such as A. xylosoxidans, were still assigned to Alcaligenes (Saiman et al. 2002). Here, we found distinct phylogenetic boundaries between the two genera. With the rapid development of high-throughput sequencing technology, we believe that phylogenomics represents a reliable and high-resolution approach for the classification of closely related organisms.

Phylogenomic and ANI analyses indicated that Achromobacter has the closest relationship with Bordetella and even suggested a very recent common ancestor for these two genera. Bordetella includes two phylogenetically distant groups: one group (the classical Bordetella subspecies) represented by Bordetella pertussis, Bordetella parapertussis, and Bordetella bronchiseptica (Parkhill et al. 2003), and another group represented by B. avium 197N and B. petrii DSM12804 (Sebaihia et al. 2006). B. petrii DSM12804 is considered the sole environmental species of Bordetella because it lacks the majority of the virulence genes (e.g., cytotoxins, T3SS, or T4SS) (Gross et al. 2008). A previous study proposed that B. petrii DSM12804 is a bridge linking pathogens (Bordetella spp.) and environmental isolates (Achromobacter spp.) (Gross et al. 2008). Our results are consistent with this hypothesis, further underline the close relationship between the two genera, and may even suggest the possibility of assigning them to a “supergenus.” The similar GC content range (61–68 %) shared by the Achromobacter and Bordetella members seems to strengthen this possibility, since members of the same genus generally have similar GC content (Lightfield et al. 2011). In addition, the phylogenetic tree based on the conserved proteins of the virulence-related T3SS is very similar to the core genome phylogeny, indicating that the T3SS of Achromobacter and Bordetella members was most probably vertically inherited from their common ancestor (Fig. 1b, d). In light of this close phylogenetic relatedness, we propose that the Achromobacter members, along with the Bordetella members, perhaps inherited the genetic loci related to pathogenicity from the common ancestor and then differentiated along separate branches of the phylogeny. When sufficient genomic divergence had accumulated, this divergence led to speciation (Cohan 2001). The low overall similarity in gene content between Achromobacter and Bordetella (Fig. 3) may be a consequence of this evolutionary process. Nevertheless, the phylogenetic signal tracing the evolutionary history of this supergenus is still rather robust. From an evolutionary viewpoint, this phylogenetic signal helps us interpret the pathogenic potential of Achromobacter (e.g., A. xylosoxidans AXX-A).
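The GC content comparison mentioned above amounts to a simple base count. The following Python sketch computes the GC percentage of a nucleotide sequence and checks whether it falls within the 61–68 % range reported for the Achromobacter and Bordetella members. Ambiguous bases such as N are ignored, which is an assumption of this sketch. The function names and the toy sequence are hypothetical.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


def within_reported_range(gc_percent: float, low: float = 61.0, high: float = 68.0) -> bool:
    """Check a GC percentage against the range quoted in the text."""
    return low <= gc_percent <= high


# Hypothetical toy sequence; real use would read a whole genome sequence.
toy_gc = gc_content("GGCCAT")
toy_in_range = within_reported_range(toy_gc)
```

For the toy sequence, four of six bases are G or C, giving about 66.7 %, which lies inside the quoted range.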

So far, little is known at the molecular level about the mechanism of pathogenesis of this potential opportunistic pathogen. For the first time, we revealed differences in genomic content between CF-source and environment-source isolates of Achromobacter and identified putative genetic determinants participating in CF infection by A. xylosoxidans AXX-A. These genes encode functions mediating bacterium/host interactions, adherence, colonization, and invasion, which plausibly enable A. xylosoxidans AXX-A to adapt to a pathogenic lifestyle. Achromobacter spp. have so far been regarded only as suspected pathogens because definite evidence of infection in CF patients is lacking; our study provides genomic evidence that some Achromobacter strains are most probably actual pathogens. Although several elements potentially involved in pathogenesis were also found in the genomes of environmental isolates, these elements appear to be insufficient for host infection. Meanwhile, the T3SS found in A. arsenitoxydans SY8 may suggest that Achromobacter is capable of infecting animals (such as pigs). The key adaptive genetic determinants in A. xylosoxidans AXX-A, A. xylosoxidans A8, and A. piechaudii HLE were unique to their genomes, implying that HGT played an essential role in adaptation to specific niches. Finally, despite the threat to health, Achromobacter exhibits the attractive properties of aromatic compound degradation and heavy metal resistance.