Introduction

Pseudomonas fluorescens is a physiologically diverse bacterial species found in soil, the rhizosphere, living plants, and water, as well as in contaminated human blood products and respiratory samples. Some P. fluorescens isolates benefit plants through their capacity to colonize plant surfaces and produce antibiotics toxic to target pathogens [1]. In contrast, some isolates are recognized as opportunistic human pathogens [2]. Other isolates have been documented to negatively affect plant growth [3] or to be pathogenic to plants [4, 5].

To date, genome sequences have been generated for several plant-associated isolates of P. fluorescens, most of which function as biological control agents that suppress plant disease [1]. Although the genome of P. fluorescens strain GcM5-1A, which is associated with pine wilt disease, was recently reported [6], genome sequences of plant-pathogenic isolates remain limited, and the virulence factors and evolutionary relationships of P. fluorescens are poorly understood. In our previous study, P. fluorescens strains were frequently isolated from infected leaves and twigs of symptomatic kiwifruit plants [7], whose symptoms resembled those of bacterial canker disease, which is mainly caused by P. syringae pv. actinidiae [8, 9]. When inoculated onto detached stems and twigs of potted kiwifruit seedlings, P. fluorescens AHK-1 caused necrosis around the inoculation sites, confirming that this strain is pathogenic on kiwifruit [7].

To obtain molecular information for exploring the genetic characteristics of pathogenic P. fluorescens isolates in greater detail, we sequenced the whole genome of P. fluorescens AHK-1. In the present study, we report the draft genome sequence of P. fluorescens AHK-1, compare it with the genomes of two other sequenced P. fluorescens isolates, and identify putative virulence factors based on sequence annotation. The data presented here may lead to a better understanding of the molecular basis of pathogenic and non-pathogenic isolates within the P. fluorescens group.

Materials and Methods

Strains and DNA Extraction

Cultures of P. fluorescens AHK-1 were stored in 30% (v/v) glycerol at −80 °C and deposited in the China Center for Type Culture Collection (CCTCC) under accession number CCTCC AB 2018073. Strain AHK-1 was grown with shaking in Luria–Bertani (LB) medium at 28 °C for 24 h, and bacterial cells were harvested by centrifugation at 5000×g for 10 min at 4 °C. Genomic DNA was extracted from the bacterial pellet using the NEB Bacterial DNA kit according to the manufacturer’s instructions. DNA purity was examined by 0.8% agarose gel electrophoresis, and the concentration was measured using a Qubit® 2.0 Fluorometer.

Genome Sequencing and Assembly

Whole-genome sequencing of P. fluorescens AHK-1 was performed on the Illumina HiSeq 2500 platform (Illumina, USA) by a sequencing service provider (Shanghai Biotechnology Corporation, Shanghai, China). Sequencing quality was assessed with FastQC and evaluated by the Q value, which is related to the sequencing error probability E by Q = −10 log10(E). We used Trimmomatic [10] to filter the raw reads by removing adapters and low-quality sequences, which included reads with ambiguous nucleotides (Q value ≤ 20) and short reads (≤ 45 bp). The trimmed reads were assembled de novo with SPAdes 3.5.0 [11]. All generated contigs were searched against the nr database using BLASTX [12]. The contigs were then aligned to the available genome sequence of P. fluorescens SBW25 with MUMmer [13] to determine their order.
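As a rough illustration of the quality relation Q = −10 log10(E), the following Python sketch converts between Q values and error probabilities and applies a simplified read filter. The function names are hypothetical. The rule of discarding a read that is ≤ 45 bp long, contains an N, or has any base with Q ≤ 20 is an assumption made for illustration only; it is not the Trimmomatic configuration used in this study.

```python
import math


def q_to_error(q: float) -> float:
    """Error probability E implied by a quality value Q, from Q = -10 * log10(E)."""
    return 10 ** (-q / 10)


def error_to_q(e: float) -> float:
    """Quality value Q for an error probability E (0 < E <= 1)."""
    return -10 * math.log10(e)


def keep_read(seq: str, quals: list[int], q_cutoff: int = 20, len_cutoff: int = 45) -> bool:
    """Hypothetical filter (an assumption, not the study's settings).

    A read is dropped if it is too short, contains an ambiguous base (N),
    or has at least one base whose quality is at or below the cutoff.
    """
    if len(seq) <= len_cutoff:
        return False
    if "N" in seq.upper():
        return False
    return min(quals) > q_cutoff
```

Under this relation, Q = 20 corresponds to an error probability of 0.01 and Q = 30 to 0.001, so a cutoff at Q = 20 tolerates at most about one miscalled base in a hundred.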

Genome Annotation

Gene functions were predicted using the Swiss-Prot annotation files (http://www.expasy.ch/sprot/ and http://www.ebi.ac.uk/swissprot/) and Prokka [14, 15]. The predicted genes were then annotated against the Clusters of Orthologous Groups (COG) database (http://www.ncbi.nlm.nih.gov/COG) with an E-value cut-off of 1.0E-5. The predicted genes were classified with Blast2GO [16] to obtain the number of genes in each Gene Ontology (GO) category under molecular function, cellular component, and biological process. Pathway assignments were performed with the online Kyoto Encyclopedia of Genes and Genomes (KEGG) Automatic Annotation Server (http://www.genome.jp/tools/kaas/) [17]. tRNA genes were predicted using tRNAscan-SE 1.23 [18], rRNA genes using RNAmmer 1.2 [19], and sRNAs using Infernal 1.1 [20]. Candidate virulence factors were searched among all predicted genes in AHK-1 as described previously [6].
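The E-value cut-off can be illustrated with a short Python sketch that keeps, for each query gene, the highest-scoring hit with an E-value no greater than 1.0E-5. It assumes that search results are available in the standard 12-column BLAST tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score); whether the annotation pipeline in this study used that layout is not stated, and the function name is hypothetical.

```python
def best_hits(lines, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} for the best hit per query.

    Assumes tab-separated BLAST tabular rows with the E-value in column 11
    and the bit score in column 12 (1-based). Hits above the cut-off are ignored.
    """
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

Passing an open file handle, or any iterable of lines, is enough; genes absent from the returned dictionary would be treated as unassigned under this cut-off.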

Comparative and Phylogenetic Analysis

To determine the variation in genome content and organization between AHK-1 and other sequenced isolates of P. fluorescens, the genome of AHK-1 was compared with the two representative genomes of P. fluorescens GcM5-1A [21, 22] and SBW25 [23] using a multiway BLASTp analysis. Genes present exclusively in an individual strain and those shared between two or three strains were counted using Mauve and represented in Venn diagrams generated with Venn Diagram on the R platform [24]. The phylogenetic relationship between P. fluorescens AHK-1 and other previously published representative isolates of P. fluorescens was analyzed with Molecular Evolutionary Genetics Analysis (MEGA) software, based on (1) 16S rRNA and (2) concatenated sequences of nine highly conserved housekeeping genes: acsA, aroE, guaA, gyrB, mutL, ppsA, pyrC, recA, and rpoB [1, 2]. Escherichia coli strain K-12 was used as the outgroup. Information on the selected reference strains is shown in Table S1.
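The counting of shared and strain-specific genes amounts to partitioning three sets into the seven regions of a Venn diagram. The Python sketch below shows that partition, assuming each genome has already been reduced to a set of orthologous-group identifiers. The identifiers and the function name are hypothetical, and this is not the Mauve or R workflow used in the study.

```python
def venn_regions(a, b, c):
    """Partition three collections of ortholog-group IDs into the seven Venn regions."""
    a, b, c = set(a), set(b), set(c)
    return {
        "core": a & b & c,          # present in all three genomes
        "a_b_only": (a & b) - c,    # shared by exactly two genomes
        "a_c_only": (a & c) - b,
        "b_c_only": (b & c) - a,
        "a_only": a - b - c,        # strain-specific genes
        "b_only": b - a - c,
        "c_only": c - a - b,
    }


# Toy identifiers for illustration only (not data from this study).
ahk1 = {"og1", "og2", "og3", "og4"}
sbw25 = {"og1", "og2", "og5"}
gcm5 = {"og1", "og3", "og6"}

counts = {region: len(ids) for region, ids in venn_regions(ahk1, sbw25, gcm5).items()}
```

Because the seven regions are disjoint, their sizes sum to the size of the union of the three gene sets, which gives a simple consistency check on any reported Venn diagram.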

Nucleotide Sequence Accession Numbers

The Whole Genome Shotgun sequence project of P. fluorescens AHK-1 has been deposited at GenBank under the accession number QRBA00000000. The BioProject designation for this project is PRJNA473300.

Results and Discussion

Genome Assembly and Annotation

Illumina paired-end sequencing generated a total of 7,110,161 raw reads and 7,000,101 clean reads, with 1988 Mb of total bases. The draft genome of P. fluorescens AHK-1 was assembled into 50 scaffolds amounting to 7,035,786 bp, with a G+C content of 60.88%. The largest scaffold was 626.42 kb and the N50 size was 331.26 kb. A total of 6327 protein-coding genes, four rRNA operons, and 58 tRNA loci were predicted in the genome of AHK-1 (Table 1). Of the 6327 predicted genes, 5024 (79.41%) were assigned to COGs and 2418 (38.22%) to KEGG.
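For readers unfamiliar with the assembly statistics quoted above, the following Python sketch shows how N50 and G+C content are conventionally computed from a list of scaffold lengths and a sequence. The function names and the toy inputs in the usage note are illustrative assumptions; the sketch does not reproduce the actual AHK-1 scaffolds.

```python
def n50(lengths):
    """Length L such that scaffolds of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be a non-empty collection of positive integers")


def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100 * gc / acgt if acgt else 0.0
```

For example, scaffold lengths of 2, 2, 2, 3, 3, 4, 8, and 8 sum to 32; the two longest scaffolds already cover 16 bases, so the N50 is 8.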

Table 1 Genomic features of P. fluorescens AHK-1

Among the COG functional categories (Fig. S1), “amino acid transport and metabolism” was the largest, comprising 13.18% (662) of the 5024 COG-assigned genes, followed by “general function” (12.96%, 651), “transcription” (11.21%, 563), and “function unknown” (8.92%, 448). Metabolic pathway analysis using KEGG orthology revealed 85 metabolic pathways. In the Blast2GO analysis, 1157, 3759, and 3482 genes were assigned to cellular components, molecular functions, and biological processes, respectively (Fig. S2). Within molecular functions, “catalytic activity”, “transporter activity”, and “binding” were highly represented, whereas “cellular process”, “metabolic process”, and “single-organism process” were the most represented GO categories within biological processes. A high percentage of genes were classified as “cell” and “cell part” under the cellular components category.

Comparative Genomic Analysis

The presence of orthologs of P. fluorescens AHK-1 coding sequences in the genomes of P. fluorescens SBW25 and GcM5-1A was assessed (Fig. 1). Among these isolates, 3998 genes formed the core genome, accounting for 64.87%, 76.62%, and 68.64% of the total genes in the AHK-1, GcM5-1A, and SBW25 genomes, respectively. In addition, 946 strain-specific genes were observed in P. fluorescens AHK-1, which may contribute to strain-specific features of this bacterium. Of these, 80.13% (758) were classified into 20 COG functional categories, including “general function prediction only” (11.95%), “transcription” (11.10%), “amino acid transport and metabolism” (10.36%), “signal transduction mechanisms” (8.46%), and other functions in smaller proportions. The remaining 188 unique genes (19.87%) were not classified into any COG category (Table 2).
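The proportions reported for the strain-specific genes can be checked with a few lines of Python. The sketch assumes that every percentage in the paragraph is expressed relative to all 946 strain-specific genes; the per-category counts it derives are back-calculated approximations, not values taken from Table 2, and the helper names are hypothetical.

```python
STRAIN_SPECIFIC = 946   # strain-specific genes in AHK-1
COG_CLASSIFIED = 758    # assigned to a COG category
UNCLASSIFIED = 188      # not assigned to any COG category


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


def count_from_percent(pct, whole):
    """Approximate gene count implied by a reported percentage (assumption: same denominator)."""
    return round(whole * pct / 100)


reported_pct = {
    "general function prediction only": 11.95,
    "transcription": 11.10,
    "amino acid transport and metabolism": 10.36,
    "signal transduction mechanisms": 8.46,
}

# Back-calculated, approximate counts per category.
approx_counts = {name: count_from_percent(p, STRAIN_SPECIFIC) for name, p in reported_pct.items()}
```

The classified and unclassified genes add up to 946, and 758/946 and 188/946 round to the reported 80.13% and 19.87%. The four category percentages correspond to roughly 113, 105, 98, and 80 genes under the stated assumption.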

Fig. 1

Venn diagram comparing the gene inventories of three P. fluorescens isolates AHK-1, SBW25 and GcM5-1A. The numbers of shared and unique genes are presented

Table 2 The number of strain-specific genes of P. fluorescens AHK-1 associated with the COG functional categories

Moreover, each genome analyzed in this study has between 690 and nearly 1000 unique genes relative to the other two, suggesting high variation in genome content and heterogeneity in genome organization. This is consistent with the results of a comparison of four other P. fluorescens genomes (WH6, SBW25, Pf0-1, and Pf-5) [3]. AHK-1 has more predicted genes than either of the other two sequenced P. fluorescens isolates. In addition, the percentage of shared orthologous groups was 80% between AHK-1 and SBW25, 68% between AHK-1 and GcM5-1A, and 71% between SBW25 and GcM5-1A. These low percentages of shared orthologous groups are consistent with those previously observed among P. fluorescens isolates [6, 25].

Phylogenetic Analysis

The phylogenetic tree was generated using the maximum likelihood (ML) algorithm in MEGA based on multilocus sequence analysis (Fig. 2). The 18 previously sequenced isolates of P. fluorescens and strain AHK-1 fall into a single large clade composed of two sub-clades. Seventeen isolates fall into the first sub-clade, within which AHK-1 is more distant from the others. The second sub-clade is composed of P. fluorescens Pf0-1 and P. fluorescens R124. These results are reasonably consistent with a maximum likelihood phylogeny based on 16S rRNA (Fig. S3). Moreover, the phylogeny obtained in this study is consistent with recent phylogenetic studies in which Pf0-1 represents a distinct clade clearly distinguished from SBW25 within the P. fluorescens group [1, 6].

Fig. 2

Phylogenetic tree showing the relationship of AHK-1 to other representative isolates of P. fluorescens. The tree was generated with MEGA5 using the maximum likelihood method based on concatenated sequences of nine core housekeeping genes: acsA, aroE, guaA, gyrB, mutL, ppsA, pyrC, recA, and rpoB. Escherichia coli strain K-12 was used as the outgroup. Bootstrap support values (1000 replicates) above 50 are shown at the nodes. The scale bar indicates the number of nucleotide substitutions per site

Virulence Factors

To provide a preliminary view of the genes involved in pathogenesis, we identified 72 predicted genes as candidate virulence factors of P. fluorescens AHK-1 through BLAST searches against the VFDB [26] (Table S2). Among these factors, the fliA gene, which encodes a flagellar biosynthesis protein, participates in flagellar motility, which is critical for host colonization by many bacterial pathogens. The sigma factor FliA has been found to repress the quorum-sensing-controlled transcriptional regulator HapR and thereby allow increased expression of virulence factors in Vibrio cholerae [27]. Inactivation of the fliC gene, which encodes flagellin, can result in increased toxin activity in cell culture supernatants of Clostridium difficile [28]. The fliC gene was observed in the genome of P. fluorescens GcM5-1A and considered a crucial pathogenic factor in that strain [6]. However, BLAST searches of all predicted genes in AHK-1 against the VFDB failed to detect a homolog of fliC, and we did not find a highly homologous fragment of fliC in the AHK-1 genome. Several secretion systems were also identified in AHK-1, including type III, IV, and VI secretion systems. The type III secretion system (T3SS) is considered necessary for full virulence of pathogenic bacteria [29, 30]. Genes for a complete and functional T3SS were recently identified in the genomes of P. fluorescens GcM5-1A and WH6 [3, 6], suggesting that the T3SS may play an important role in subverting and colonizing the hosts of these bacteria. In contrast, P. fluorescens AHK-1 was found to possess only partial orthologs of genes encoding T3SS components when compared with P. fluorescens GcM5-1A and P. syringae pv. tomato DC3000. One possible explanation is that several T3SS-encoding CDSs are missing from the AHK-1 genome because the assembly described in this work is a draft.

Furthermore, the type VI secretion system (T6SS) is a recently discovered virulence mechanism used by Gram-negative bacteria [31], and VipA/VipB have been shown to play key roles in the virulence of clinically important pathogens including V. cholerae and P. aeruginosa [32]. The essential T6SS proteins VipA/VipB were detected in the genome of AHK-1 in this work, suggesting that they may play an important role in the virulence mechanism of this strain.

In conclusion, we characterized the genome of P. fluorescens AHK-1, isolated from infected kiwifruit leaves, which can be pathogenic and is involved in kiwifruit bacterial disease. Compared with SBW25 and GcM5-1A, AHK-1 contained more strain-specific genes involved in transcription, amino acid transport and metabolism, and signal transduction mechanisms. The candidate virulence factors detected in AHK-1 provide valuable clues for investigating its interaction with its host. Additionally, the draft genome sequence will serve as a reference for the analysis of P. fluorescens isolates associated with plant disease.