Introduction

Potato (Solanum tuberosum) is the fourth most important food crop in the world and is widely cultivated because it requires no special growth environment (Larkin and Honeycutt 2006). It is surpassed only by rice (Oryza sativa), wheat (Triticum aestivum) and maize (Zea mays) in terms of production and cultivated area (King and Slavin 2013). However, various pathogens and pests, including bacteria, viruses and fungi, can cause severe yield and economic losses in potato (Hagland 2011). Blackleg, one of the most important bacterial diseases of potato, is caused by several bacterial species belonging to the soft rot Enterobacteriaceae (SRE). Pectobacterium carotovorum subsp. brasiliense (Pcb) (Duarte et al. 2004), Pectobacterium atrosepticum (Pa) (Gardan et al. 2003), Pectobacterium carotovorum subsp. carotovorum (Pcc), Pectobacterium wasabiae (Pwa) (Pitman et al. 2008) and several Dickeya spp. have been identified as causal agents of potato blackleg in the field (Toth et al. 2011; van der Wolf et al. 2014). Among these, P. atrosepticum (formerly named Erwinia carotovora subsp. atrosepticum) (Hauben et al. 1998), whose host range is restricted to potato, is regarded as the predominant blackleg pathogen in temperate regions; the disease is mostly characterized by symptoms such as black rot lesions (Gardan et al. 2003). Currently, no method effectively controls the disease caused by P. atrosepticum. No chemical agents are available to prevent the spread of these pathogens, and changes to planting patterns and storage conditions are not effective against the disease (Czajkowski et al. 2011; Yaganza et al. 2014).

In addition to Pectobacterium and Dickeya, the family Enterobacteriaceae includes many well-studied human pathogens such as Shigella, Yersinia and Salmonella, as well as the model species Escherichia coli. Many genome sequences have been reported from this family (Parkhill et al. 2001a, b; Perna et al. 2001; Welch et al. 2002; Duchaud et al. 2003). In particular, the genome sequence of E. coli K-12 provides crucial information for genomic studies, because this strain is widely regarded as the standard model strain in almost all areas of biological research (Blattner et al. 1997). Although genome analyses of bacteria in the Enterobacteriaceae have provided comprehensive data for research, only limited genetic information has been collected for the genus Pectobacterium, an important member of the family.

At present, seven genome sequences of Pectobacterium are available, which enables comparative genomic analysis of these species: two strains of P. carotovorum (PC1 and PCC21), two strains of P. atrosepticum (SCRI1043 and CFBP6276) (Bell et al. 2004a; Kwasiborski et al. 2013) and three strains of P. wasabiae (WPP163, CFBP3304 and SCC3193) (Koskinen et al. 2012; Nykyri et al. 2012). Previous studies on these strains showed that pathogenicity genes encoding effector proteins and secretion systems contribute to plant disease. Moreover, soft rot pathogenesis relies largely on plant cell wall-degrading enzymes (PCWDEs), which can cause extensive tissue maceration (Expert 1999; Franza et al. 2002). However, the mechanism may be more complex and subtle than previously proposed (Mulholland et al. 1993; Jones et al. 1999; Kang et al. 2002; Toth et al. 2003). Here we present the complete genome sequence of P. atrosepticum JG10-08 to help reveal its molecular pathogenesis and to provide a further reference for studies of pathogenic mechanisms. This work could also provide a source of new genetic material and potential targets for biological control.

Blackleg disease is becoming prevalent in the potato-growing regions of China. However, no genome sequence of a P. atrosepticum strain isolated in China has been available. Here we present the complete genome sequence of P. atrosepticum JG10-08, obtained from infected potato tubers, compare it with the genomes of five annotated SRE strains, and identify several strain-specific genes that potentially contribute to virulence.

Materials and methods

Bacterial strain and genomic DNA extraction

Pectobacterium atrosepticum JG10-08 was isolated from a potato tuber with blackleg symptoms in Heibei, China. The strain was routinely grown in Luria–Bertani (LB) liquid medium at 25 °C for 48 h. Genomic DNA was extracted from the cultured bacteria using the CTAB method (Hyman et al. 2000).

Sequencing the whole genome

The genome of P. atrosepticum JG10-08 was sequenced on an Illumina HiSeq 2000 platform (2 × 100 bp paired-end reads) to a total coverage of 95-fold. The high-quality paired-end reads were assembled with SOAPdenovo (http://soap.genomics.org.cn), and SOAP GapCloser was applied to close gaps after assembly (Luo et al. 2012). This produced a draft genome of 25 scaffolds. The remaining gaps between scaffolds were closed by PCR and Sanger sequencing.
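As a rough consistency check on the sequencing depth, the relation between coverage, read length and genome size can be sketched as follows. This is an illustrative calculation, not part of the original pipeline: the function names are hypothetical, the genome size is the finished chromosome length reported in the Results, and the read count is back-calculated from the stated 95-fold coverage rather than taken from the source.

```python
import math


def coverage(n_reads: int, read_len: int, genome_size: int) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_len / genome_size


def reads_needed(target_cov: float, read_len: int, genome_size: int) -> int:
    """Number of reads required to reach a target mean depth (rounded up)."""
    return math.ceil(target_cov * genome_size / read_len)


GENOME_SIZE = 5_004_926  # bp, chromosome length reported for JG10-08
READ_LEN = 100           # bp, from the 2 x 100 bp run

# Back-calculated, not reported in the source.
n_reads = reads_needed(95, READ_LEN, GENOME_SIZE)
n_pairs = n_reads // 2
print(n_reads, n_pairs, round(coverage(n_reads, READ_LEN, GENOME_SIZE), 2))
```

Under these assumptions, 95-fold coverage corresponds to roughly 2.4 million read pairs after quality filtering.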

Glimmer was used to predict protein-coding sequences (Delcher et al. 1999). The genome sequence was annotated with Rapid Annotations using Subsystem Technology (RAST) (Aziz et al. 2008), checked by manual NCBI BLAST searches, and compared with the coding sequences (CDSs) of the genomes of P. atrosepticum SCRI1043, P. carotovorum subsp. carotovorum PC1, P. carotovorum subsp. carotovorum PCC21, P. wasabiae SCC3193 and P. wasabiae WPP163. Comparative functional analyses of the genomes were conducted using Clusters of Orthologous Groups of proteins (COGs).

Phylogenetic analyses

Phylogenetic relationships were analyzed based on both the genome sequences and the 16S rRNA genes of six Pectobacterium strains and one Yersinia strain: P. atrosepticum JG10-08 (GenBank accession no. CP007744.1), P. atrosepticum SCRI1043 (BX950851.1), P. carotovorum subsp. carotovorum PC1 (CP001657.1), P. carotovorum subsp. carotovorum PCC21 (CP003776.1), P. wasabiae SCC3193 (CP003415.1), P. wasabiae WPP163 (CP001790.1) and Yersinia pestis CO92 (CP009973.1). The sequences were aligned with ClustalW, and ambiguously aligned regions were trimmed. The phylogenetic trees were constructed using the neighbor-joining method in Phylip.
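Neighbor-joining builds a tree from a matrix of pairwise distances between aligned sequences. The sketch below shows one simple way such a matrix can be computed, using the uncorrected proportion of differing sites (p-distance) and skipping gapped columns. It is an illustration only: it is not the Phylip implementation, the distance model actually used in this study is not stated, and the strain labels and sequences are toy values.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def distance_matrix(alignment: dict) -> dict:
    """Pairwise p-distances keyed by (name1, name2), names sorted."""
    names = sorted(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m, n in combinations(names, 2)
    }


# Toy alignment (hypothetical sequences, not real 16S rRNA data).
toy = {
    "strain_A": "ACGTACGTAC",
    "strain_B": "ACGTACGTAT",
    "strain_C": "ACGAAC-TTT",
}
dm = distance_matrix(toy)
for pair, d in dm.items():
    print(pair, round(d, 3))
```

A matrix of this form is the input that a neighbor-joining program then clusters into a tree.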

Based on the results of the phylogenetic analyses, strains SCRI1043, PCC21 and WPP163 were chosen for synteny comparison with strain JG10-08 using Mauve. The similarity of the locally collinear blocks (LCBs) was then calculated with Gene Nees.

Analysis of related pathogenic genes

The genome sequences and functional annotations of P. atrosepticum SCRI1043, P. carotovorum subsp. carotovorum PC1, P. carotovorum subsp. carotovorum PCC21, P. wasabiae SCC3193 and P. wasabiae WPP163 were downloaded from the NCBI website. Sequence analysis yielded a large number of candidate pathogenicity genes. Based on the annotated gene functions, we identified the genes predicted to encode toxins, PCWDEs and the six secretion systems in these five pathogenic strains and in JG10-08.

Results and discussion

General genomic features

The genome of P. atrosepticum JG10-08 contains a 5,004,926 bp long circular chromosome with no plasmid (Fig. 1). The average G+C content of the genome is 51.15%, which is similar to that of P. wasabiae (50.53%), but is lower than that of P. carotovorum (52.02%) (Table 1).
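For readers unfamiliar with the two composition statistics shown in Fig. 1, the following minimal sketch computes G+C content and windowed GC skew, (G − C)/(G + C), from a sequence held as a plain string. The function names, the window size and the toy sequence are illustrative assumptions; the tools used to produce the published values are not specified here.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def gc_skew(seq: str, window: int) -> list:
    """GC skew (G - C) / (G + C) for consecutive non-overlapping windows.

    A window without any G or C is given a skew of 0.0.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


toy = "GGGCATCC"  # toy sequence, not genome data
print(gc_content(toy))
print(gc_skew(toy, window=4))
```

On a complete bacterial chromosome, the sign of the skew typically changes near the replication origin and terminus, which is why it is commonly plotted alongside GC content.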

Fig. 1

Circular representation of the P. atrosepticum JG10-08 genome. The outer scale shows the position along the genome sequence. From the outside inward, circles 1 and 2 show COG-annotated coding sequences, circle 3 shows KEGG enzymes, circle 4 shows RNA genes, circle 5 shows the GC content (%) and circle 6 shows the GC skew

Table 1 The genomic features of sequenced genomes of Pectobacterium strains

A total of 4672 CDSs were annotated in the P. atrosepticum JG10-08 genome using Glimmer 2.0. The predicted CDSs account for 86.1% of the total genome length (average CDS length, 941.9 bp). The annotated genes are transcribed in both the forward and reverse orientations relative to the direction of DNA replication. The genome encodes 76 tRNA operons and 39 rRNA genes, the latter distributed in seven sets of 16S-23S-5S rRNA operon regions.
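Summary statistics such as mean CDS length and the coding fraction of a genome can be derived from the CDS coordinates alone. The sketch below is one possible way to do this, under the assumption that overlapping CDSs should be counted only once when computing the coding fraction; how the published 86.1% was calculated is not stated, and the coordinates used here are toy values.

```python
def coding_stats(cds, genome_len):
    """Return (mean CDS length, percent of genome covered by CDSs).

    cds: list of (start, end) tuples, 1-based and inclusive.
    Overlapping CDSs are merged so shared bases are counted once.
    Features wrapping the origin of a circular genome are not handled.
    """
    if not cds:
        raise ValueError("no CDS coordinates given")
    mean_len = sum(end - start + 1 for start, end in cds) / len(cds)

    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(cds):
        if cur_end is None or start > cur_end:
            # Close the previous merged interval and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged interval.
            cur_end = max(cur_end, end)
    covered += cur_end - cur_start + 1
    return mean_len, 100.0 * covered / genome_len


# Toy coordinates (hypothetical): the first two CDSs overlap by 50 bp.
toy_cds = [(1, 300), (251, 550), (701, 1000)]
mean_len, coding_pct = coding_stats(toy_cds, genome_len=1000)
print(mean_len, coding_pct)
```

Because overlapping genes are merged, the coding fraction can be smaller than the value obtained by simply multiplying the CDS count by the mean CDS length.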

To further characterize the functions encoded by these 4672 genes, we classified them using Clusters of Orthologous Groups of proteins (COGs). In total, 4252 (91%) of the predicted genes of P. atrosepticum were assigned to COG categories (Table 2). Among the assigned genes, 43.27% are related to metabolism, 21.05% to cellular processes and signaling, and 16.56% to information storage and processing. The remaining 19.12% could not be assigned a specific function because their features and functions remain obscure. COG classification is valuable for organizing and interpreting the annotated genes of a complete genome because it predicts function at the level of protein families (Tatusov et al. 1997).
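The grouping of individual COG categories into the major functional classes quoted above can be sketched as follows. The letter-to-class mapping follows the standard 25-category COG scheme; whether the authors' fourth class corresponds exactly to categories R and S is an assumption, and the per-gene letters are toy values rather than data from Table 2.

```python
from collections import Counter

# Standard one-letter COG categories grouped into four major classes.
COG_GROUPS = {
    "information storage and processing": "JAKLB",
    "cellular processes and signaling": "DYVTMNZWUO",
    "metabolism": "CGEFHIPQ",
    "poorly characterized": "RS",
}
LETTER_TO_GROUP = {
    letter: group
    for group, letters in COG_GROUPS.items()
    for letter in letters
}


def group_shares(cog_letters):
    """Percentage of COG-assigned genes falling in each major class.

    cog_letters: one COG assignment per gene; for multi-letter
    assignments only the first letter is used (a simplification).
    """
    counts = Counter(
        LETTER_TO_GROUP[c[0]]
        for c in cog_letters
        if c and c[0] in LETTER_TO_GROUP
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genes with a recognized COG category")
    return {group: 100.0 * counts[group] / total for group in COG_GROUPS}


# Toy input (hypothetical assignments for ten genes).
toy = ["E", "G", "C", "K", "J", "M", "T", "S", "R", "P"]
shares = group_shares(toy)
for group, pct in shares.items():
    print(f"{group}: {pct:.1f}%")
```

The four shares always sum to 100% of the assigned genes, as do the percentages reported in the text.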

Table 2 Distribution of genes associated with the 25 general COG functional categories

Comparison of the genome sequences of P. atrosepticum JG10-08 with those of other Pectobacterium spp

To examine the relationships between P. atrosepticum JG10-08 and previously sequenced strains of the genus Pectobacterium, we downloaded the genome sequences of five other Pectobacterium strains and one Y. pestis strain from NCBI and performed a phylogenetic analysis (Fig. 2). The phylogenetic tree based on whole-genome sequences revealed that P. atrosepticum is most closely related to P. wasabiae. The host range and survival conditions of P. carotovorum are broader than those of P. atrosepticum and P. wasabiae, which probably explains its more distant relationship to them. Unexpectedly, the relationships among Pectobacterium species inferred from the 16S rRNA phylogeny were inaccurate (Fig. 3), because strain SCC3193 was placed distantly from strain WPP163 although both are P. wasabiae strains. A previous comparison of the proteomes of all sequenced soft rot bacteria (Nykyri et al. 2012) showed that SCC3193, previously classified as P. carotovorum, groups with P. wasabiae. Our phylogenetic tree based on genome sequences is in line with this finding. Thus, phylogenetic analysis at the whole-genome level provides strong support for an accurate classification of these species.

Fig. 2

Phylogenetic relationship of the P. atrosepticum JG10-08 genome sequence to those of five other Pectobacterium strains and one Y. pestis strain. P. atrosepticum JG10-08 is closely related to P. atrosepticum SCRI1043

Fig. 3

Phylogenetic relationship of P. atrosepticum JG10-08 to five other Pectobacterium strains and one Y. pestis strain based on the 16S rRNA gene. P. atrosepticum JG10-08 is closely related to P. atrosepticum SCRI1043

To further compare the genome structures of the sequenced strains of the genus Pectobacterium, the whole-genome sequences were aligned using Mauve. The arrangement of genes in P. atrosepticum JG10-08 differed from that in P. wasabiae WPP163 and P. carotovorum PCC21. The aligned regions of P. atrosepticum JG10-08 are mostly oriented in the forward direction relative to the genome sequence, as in P. atrosepticum SCRI1043, whereas those of P. wasabiae WPP163 and P. carotovorum PCC21 occur in both the forward and reverse-complement orientations (Fig. 4). Compared with SCRI1043, P. atrosepticum JG10-08 shows no insertion or deletion of large fragments; only two inversions were found, together with frequent gene rearrangements. In addition, we calculated the sequence similarity of the LCBs among these four strains using Gene Nees software. The genome arrangement of JG10-08 is almost syntenic with that of SCRI1043, and the two differ by only 7.0% in the pairwise alignment. In contrast, JG10-08 differs by 47.7% and 49.4% from PCC21 and WPP163, respectively, and SCRI1043 differs by 48.3% and 49.1% from PCC21 and WPP163, respectively. The difference between P. wasabiae WPP163 and P. carotovorum PCC21 is larger than that between any other pair of strains. Taken together, the pairwise alignments support the earlier finding that JG10-08 and SCRI1043 belong to the same species. Moreover, the P. atrosepticum strains share slightly more similarity with P. carotovorum than with P. wasabiae. We assume that the genomes of strains JG10-08 and SCRI1043 have remained relatively stable during evolution, which accounts for their close relationship, whereas the LCBs of P. wasabiae WPP163 and P. carotovorum PCC21 have undergone numerous changes.

Fig. 4

Synteny analysis of P. atrosepticum JG10-08 and SCRI1043, P. wasabiae WPP163 and P. carotovorum PCC21 genomes. Pairwise alignments of genomes were generated using Mauve. The sequence similarity in the pairwise alignment of P. atrosepticum JG10-08 and SCRI1043 was 93.0%. The similarity between P. atrosepticum JG10-08 and P. carotovorum PCC21 was 52.3% and between P. atrosepticum JG10-08 and P. wasabiae WPP163 was 50.6%. The sequence similarity in the pairwise alignment of P. atrosepticum SCRI1043 and P. carotovorum PCC21 was 51.7%, between P. atrosepticum SCRI1043 and P. wasabiae WPP163 was 50.9% and between P. wasabiae WPP163 and P. carotovorum PCC21 was 46.3%
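The percentage differences quoted in the text are the complements of the pairwise similarity values given in the Fig. 4 caption. The short sketch below makes that conversion explicit and ranks the strain pairs; the similarity values are copied from the caption, while the variable names are illustrative.

```python
# Pairwise LCB sequence similarity (%) as given in the Fig. 4 caption.
similarity = {
    ("JG10-08", "SCRI1043"): 93.0,
    ("JG10-08", "PCC21"): 52.3,
    ("JG10-08", "WPP163"): 50.6,
    ("SCRI1043", "PCC21"): 51.7,
    ("SCRI1043", "WPP163"): 50.9,
    ("WPP163", "PCC21"): 46.3,
}

# Difference (%) is simply 100 minus similarity.
difference = {pair: round(100.0 - s, 1) for pair, s in similarity.items()}

most_similar = max(similarity, key=similarity.get)
least_similar = min(similarity, key=similarity.get)

for pair in sorted(difference, key=difference.get):
    print(pair, difference[pair])
print("most similar:", most_similar)
print("least similar:", least_similar)
```

Ranked this way, the two P. atrosepticum strains form the closest pair, and WPP163 versus PCC21 is the most divergent pair.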

Pathogenic factors

A total of 168 genes associated with pathogenicity were identified in the genome of P. atrosepticum JG10-08. Of these, 25 encode plant cell wall-degrading enzymes (PCWDEs), 22 are related to toxins and 121 are involved in secretion systems (Table 3). Statistical analysis showed that the number of toxin-encoding genes is similar among the three species, whereas the number of PCWDE-encoding genes differs highly significantly. The numbers of secretion system genes and of total pathogenicity genes did not differ significantly among the three species at the 1% level. At the 5% level, however, P. carotovorum (group a) differed significantly from both P. atrosepticum (group b) and P. wasabiae (group b). The numbers of total pathogenicity genes and of secretion-related genes were notably higher in P. carotovorum than in P. atrosepticum and P. wasabiae (Table 4), which is consistent with the genetic relationships among these three species described above.

Table 3 The number of pathogenic genes distributed in six Pectobacterium strains
Table 4 The significant difference analysis in the total number of pathogenic genes of Pectobacterium genus

Genes encoding toxins

Compared with P. atrosepticum SCRI1043, nine virulence-related genes were found only in P. atrosepticum JG10-08: YafW, YefM, YkfI, YoeB, RelE, RelB, StbD, StbE and Phd. The toxin proteins encoded by YafW, YefM, YkfI and YoeB have been reported to directly induce plant diseases (Christensen et al. 2004; Harrison et al. 2009). The RelE, RelB, StbD and StbE genes play a key role in the replication and stability of toxins (Takagi et al. 2005; Li et al. 2009; Unterholzner et al. 2013). The Phd gene encodes a toxin protein associated with suppression of the host defense system. Although the functions of these nine genes in JG10-08 remain to be determined, our data provide a framework for investigating the pathogenic system of P. atrosepticum.

Plant cell wall-degrading enzymes are essential for pathogenesis of P. atrosepticum

The P. atrosepticum JG10-08 genome contains a number of PCWDE genes whose products are released into the extracellular space of the host. These proteins contribute to pathogenesis in three distinct ways: degradation, nutrition and feedback regulation (Yang et al. 2007). The pathogen benefits from the nutrients released by degradation, and the degradation products that accumulate in the host induce the bacteria to produce more enzymes. The production of PCWDEs is therefore the hallmark of infection by soft rot pectobacteria. We identified a total of 25 known or putative genes encoding pectinases, cellulases and proteinases.

Pectinesterase plays a key role in the pathogenicity of P. atrosepticum. It acts on the middle lamella and the pectin in the cell wall, causing the death of host tissues. Interestingly, two more exo-polygalacturonate lyase-coding genes are annotated in P. atrosepticum JG10-08 than in strain SCRI1043 (Table 5). Using nucleotide BLAST, we found that these two genes start at positions 2,349,204 and 2,630,527 bp on the genome and are 1706 and 1631 bp long, respectively. The corresponding sequences are also present in P. atrosepticum SCRI1043 but are not annotated. It is therefore essential to determine their functions in future studies.

Table 5 The distribution of different kinds of pectinesterase in P. atrosepticum strain JG10-08 and SCRI1043

Endo-polygalacturonate lyase (pel) secreted by bacteria is one of the most crucial virulence factors for plant cell wall degradation. In P. atrosepticum JG10-08 the pel gene cluster consists of three genes, pelA, pelB and pelC, as in P. atrosepticum SCRI1043. The number of genes in pel clusters differs markedly among species and subspecies. For example, Pcc carries pelA, pelB, pelC and pelD, whereas the pel cluster in Dickeya dadantii contains at least five genes (pelA, pelB, pelC, pelD and pelE) as well as at least four secondary genes (pelI, pelL, pelZ and pelX). Moreover, the secondary pel genes have been found to be involved in protein expression. These differences may be related to host range: Pa has a single host, whereas Pcc and Dickeya dadantii infect a wide range of plants.

Secretion system

Gram-negative bacteria use six types of secretion systems (types I–VI) to export extracellular enzymes and numerous effector proteins (Fig. 5) (Lory 1998), and these systems are also conserved in P. atrosepticum JG10-08. Through the secretion systems, effectors can be delivered into the plant cell, where they manipulate host processes to facilitate bacterial infection (Holst et al. 1996). The type I secretion system (T1SS) mainly comprises the TolC, HlyB and HlyD proteins and ABC systems (Davidson and Chen 2004). The type II and type V secretion systems depend on the Sec system and operate as two-step transport pathways. Previous studies have shown that pectinases and cellulases are secreted via the type II secretion system (T2SS) and that its inactivation renders Pectobacterium avirulent (Reeves et al. 1993; Sandkvist 2001; de Chial et al. 2003; Filloux 2004). The type II system, composed mainly of general secretion and Sec proteins encoded by a cluster of 15 Gsp genes, is essential for the bacteria. The type V system comprises the autotransporter and two-partner secretion systems (Henderson et al. 1998). The AidA gene (3110 bp) is present in both P. atrosepticum JG10-08 and SCRI1043, but nucleotide BLAST comparison found no evidence of this gene in other Pectobacterium species; further experiments are needed to determine whether AidA has the predicted adhesion-related function in JG10-08. Bacteria also possess the type VI system, a recently discovered secretion mechanism. Previous findings showed that two type VI secretion genes, vasK and vasH, are expressed during infection of potato tubers and that the virulence of mutants lacking these two genes is significantly reduced (Boyer et al. 2009; Leiman et al. 2009; Silverman et al. 2011).

Fig. 5

Overview of the secretion systems used by bacteria, adapted from Tseng et al. (2009). HM host membrane, OM outer membrane, IM inner membrane, OMP outer membrane protein, MFP membrane fusion protein. ATPases are shown in yellow

The Hrp gene cluster encoding the type III secretion system is the most notable feature of strain JG10-08. The cluster shares sequence similarity with that of strain SCRI1043, but genes at several loci are translocated (Fig. 6). In addition, the HrpI and HrpD genes are inverted in orientation, and the HrpY gene is absent from the genome of P. atrosepticum JG10-08. Although translocations, inversions and a deletion are present in the JG10-08 cluster, they apparently do not affect gene expression or function, probably because other relevant genes acquired during evolution compensate for the deleted or rearranged genes. Besides the Hrp genes, P. atrosepticum JG10-08 contains several other annotated gene clusters related to type III secretion (Fig. 7), such as Hrc, Yes, Esc and Spa, which are believed to contribute to independent secretion systems and to the direct transport of effectors from the cytoplasm to the cell surface (Cornelis and Van Gijsegem 2000; Jin et al. 2003).

Fig. 6

Comparison of the Hrp gene cluster between P. atrosepticum JG10-08 and SCRI1043. Each gene is indicated by an arrow placed at its corresponding position. The length of an arrow does not reflect the size of the gene. Light green arrows indicate genes whose orientation is reversed between P. atrosepticum JG10-08 and SCRI1043. The purple arrow represents the gene that is absent from the genome of P. atrosepticum JG10-08

Fig. 7

Gene cluster analysis in P. atrosepticum JG10-08. Predicted genes and their orientations are shown by arrows. The length of an arrow does not represent the size of the gene. Arrowheads of the same color indicate homologous genes

Type IV secretion systems (T4SSs) are large protein complexes that can transport DNA, large proteins and nucleoprotein complexes. In Agrobacterium tumefaciens this system is encoded by the VirB, Pic, Cpa, Tad and Rep genes (Aguilar et al. 2010). A virB-type IV secretion system, best known as the Ti plasmid transfer system of A. tumefaciens (Bell et al. 2004b), is conserved in P. atrosepticum. However, the VirB7 gene is absent from P. atrosepticum SCRI1043, and both VirB7 and VirB3 are absent from P. atrosepticum JG10-08. Previous studies revealed that the lack of VirB3 and VirB7 is common in many bacteria. In addition, JG10-08 harbors Pil genes (Fig. 8) involved in pilin assembly, which plays an important role in early infection and colonization (Henderson et al. 1998). Nearly 40 Pil genes have been identified in Pseudomonas aeruginosa, and some of them contribute to virulence (Ishimoto and Lory 1992; Alm et al. 1996; Shikata et al. 2016), which suggests that the Pil genes in JG10-08 may also have virulence functions. Thus, the presence of the type IV system in the P. atrosepticum JG10-08 genome implies that this system may be required for infection of plants.

Fig. 8

Prediction of the Pil gene cluster in P. atrosepticum JG10-08. Predicted genes and their orientations are shown by arrows. The length of an arrow does not represent the size of the gene. Green arrows represent PilA, PilB and PilC, which are also present in P. atrosepticum SCRI1043

Conclusion

The GC content of JG10-08 is similar to that of P. wasabiae and slightly lower than that of P. carotovorum. Comparative genomic analysis of the whole-genome sequences of P. atrosepticum JG10-08, P. wasabiae and P. carotovorum provided insights into their evolutionary relationships. Our results suggest that genes involved in metabolism account for the largest proportion of all annotated genes, which may reflect a high degree of adaptation to and interaction with the host. In addition, our study highlights the importance of bacterial secretion systems for pathogenesis and indicates that the bacteria are capable of synthesizing and transporting virulence factors. Overall, the results of this research will serve as an important resource for the development of biological measures to control blackleg of potato.