Bacillus cereus, a member of the B. cereus sensu lato group [32], is an opportunistic foodborne pathogen that causes diarrhea, vomiting, and nausea [5, 8]. It has been isolated from many foods and food additives such as milk, cheese [1, 33], cereals, rice [23], dried red pepper [13], and fermented foods [22, 30, 31], with contamination levels higher than 10⁴ CFU/g [9], which is not acceptable to regulatory authorities [11, 17].

Bacteriophage JBP901, isolated from Korean fermented food, exhibited a lysis profile specific to B. cereus group species, such as B. cereus, B. thuringiensis, B. mycoides, B. weihenstephanensis [26] and even B. anthracis (unpublished data). Further studies showed that JBP901 exhibited a broad host spectrum among the B. cereus isolates, being able to lyse 279 of 344 (81 %) food and human isolates without affecting the growth of B. subtilis (50 food isolates) and B. licheniformis (12 food isolates), which are necessary for beneficial fermentation [4]. In this study, we sequenced and annotated the genome of JBP901 and provided evidence for its classification in the subfamily Spounavirinae.

Single-plaque isolation, large-scale preparation, PEG (polyethylene glycol) precipitation, CsCl gradient ultracentrifugation, and dialysis to yield a high titer of phage JBP901 have been reported previously [26]. The nucleic acid of JBP901 was manually extracted from a phage stock using a phenol-chloroform extraction protocol after DNase and SDS treatment [24]. Whole-genome sequencing of JBP901 was carried out at Macrogen Inc., South Korea, using a 454 GS-FLX system (Titanium GS70 Chemistry, Roche Life Science, Mannheim, Germany). Assembly of 97,624 reads using the Roche Newbler assembly software 2.9 yielded a single contig with an average coverage of 55× (minimum coverage of 17×) at a quality filter threshold of Q30. Geneious software v8.14 [12] was used to map total reads to the assembled genome of JBP901 to test for the presence of terminal repeats [15]. Coding sequences (CDSs) were predicted using Glimmer 3.02 [6], FGENESB [27], GeneMark.hmm 2.8 [20] and RAST 4.0 [3]. Only CDSs predicted by at least three of these four tools were selected for further analysis. The individual proteins were functionally annotated using the protein-protein BLAST (BLASTP) algorithm [2]. tRNA genes were identified using the tRNAscan-SE v.1.21 program [19] with default parameters. We also employed Easyfig [29] and CoreGenes 3.0 [21] to compare the genome with those of similar Myoviridae phages at the nucleotide and protein levels, respectively.
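The consensus rule described above (keep a CDS only if at least three of the four predictors call it) can be sketched as follows. This is a minimal illustration and not the pipeline used in this study: the `Cds` record, the function name `consensus_cds`, and the matching criterion (two calls are treated as the same gene when they share strand and stop coordinate) are assumptions introduced here, and the coordinates in the example are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Cds(NamedTuple):
    """A predicted coding sequence (hypothetical record, 1-based coordinates)."""
    start: int   # position of the start codon
    stop: int    # position of the stop codon
    strand: str  # "+" or "-"


def consensus_cds(predictions: Dict[str, Iterable[Cds]],
                  min_tools: int = 3) -> List[Cds]:
    """Return the CDSs called by at least `min_tools` predictors.

    Assumption: calls sharing strand and stop coordinate are the same gene.
    Start codons often differ between tools, so the most frequently predicted
    start is kept, and ties go to the longest CDS.
    """
    # (strand, stop) -> tool name -> that tool's call for this gene
    calls: Dict[Tuple[str, int], Dict[str, Cds]] = defaultdict(dict)
    for tool, cds_list in predictions.items():
        for cds in cds_list:
            calls[(cds.strand, cds.stop)][tool] = cds

    selected: List[Cds] = []
    for by_tool in calls.values():
        if len(by_tool) < min_tools:
            continue  # too few tools agree on this gene
        candidates = list(by_tool.values())
        starts = [c.start for c in candidates]
        best = max(candidates,
                   key=lambda c: (starts.count(c.start), abs(c.stop - c.start)))
        selected.append(best)
    # Report in genome order
    return sorted(selected, key=lambda c: min(c.start, c.stop))


# Invented example: only the first gene is supported by three or more tools.
example = {
    "glimmer":  [Cds(100, 400, "+"), Cds(900, 500, "-")],
    "fgenesb":  [Cds(130, 400, "+"), Cds(900, 500, "-")],
    "genemark": [Cds(100, 400, "+")],
    "rast":     [Cds(100, 400, "+"), Cds(1200, 1500, "+")],
}
kept = consensus_cds(example)
print(kept)
```

Matching on the stop codon rather than on both ends is a common convention when merging gene calls, because predictors disagree far more often on the start codon than on the reading frame and stop.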

The JBP901 genome is 159,492 bp long with an overall G+C content of 39.7 mol%. When total reads were mapped with Geneious software, a higher read coverage was observed in the region between nucleotides 22,000 and 29,500 (data not shown), suggesting the existence of a terminal redundancy region [34]. A total of 201 open reading frames (ORFs) (172 on the reverse strand and 29 on the forward strand) and 19 tRNAs were identified (Supplementary Fig. 1). Among the total ORFs, 41 putative proteins exhibited significant similarity (higher than 50 % amino acid sequence identity) to functionally annotated proteins of bacteriophages in the database, and these were classified into six functional groups (Supplementary Table 1): structure (tail protein, tail sheath protein, major capsid precursor, prohead protease, adsorption-associated tail protein, baseplate protein I, baseplate protein II, minor tail protein, and tail lysin), nucleotide metabolism (thymidylate synthase, methyltransferase, C-5 cytosine-specific DNA methylase I, II & III, nucleotidyltransferase, ribonucleoside-diphosphate reductase alpha & beta subunit, and deoxyuridine 5’-triphosphate), DNA replication and transcription (DNA primase, DNA-binding protein, exonuclease I & II, DNA helicase, RecA-like recombinase, ssDNA binding protein, DNA polymerase I & II, and transcriptional regulator I & II), DNA packaging (terminase large subunit and portal protein), host lysis (cell wall hydrolase/autolysin and holin), and others (PhoH family protein, metallo-beta-lactamase superfamily protein, FtsK_SpoIIIE-family protein, RNA polymerase sigma-70 factor, and metallo-dependent phosphatase protein).
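The coverage argument can be illustrated with a short sketch. A terminally redundant region is present at both ends of each packaged genome, so when reads are mapped to an assembly that contains it only once, it is expected to show roughly twice the typical depth. The function name `high_coverage_regions`, the 500-bp non-overlapping window, and the 1.8-fold threshold are assumptions chosen for illustration and are not parameters from this study; the depth profile in the example is synthetic.

```python
from statistics import median
from typing import List, Sequence, Tuple


def high_coverage_regions(depth: Sequence[int],
                          window: int = 500,
                          fold: float = 1.8) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) intervals whose windowed mean depth is at
    least `fold` times the genome-wide median depth.

    `depth[i]` is the read depth at genome position i + 1. Windows do not
    overlap, and adjacent qualifying windows are merged into one interval.
    """
    if not depth:
        return []
    baseline = median(depth)
    if baseline == 0:
        return []  # no meaningful baseline to compare against

    regions: List[Tuple[int, int]] = []
    for begin in range(0, len(depth), window):
        chunk = depth[begin:begin + window]
        mean_depth = sum(chunk) / len(chunk)
        if mean_depth >= fold * baseline:
            start, end = begin + 1, begin + len(chunk)
            if regions and regions[-1][1] == start - 1:
                regions[-1] = (regions[-1][0], end)  # extend previous interval
            else:
                regions.append((start, end))
    return regions


# Synthetic profile: depth 55 everywhere except a doubled stretch.
toy_depth = [55] * 2000 + [110] * 1500 + [55] * 6500
print(high_coverage_regions(toy_depth))  # [(2001, 3500)]
```

The resolution of the reported boundaries is limited by the window size; a smaller window or a sliding window would localize the edges more precisely at the cost of noisier estimates.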

It is generally accepted that phages for biocontrol must possess a broad host range, be strictly lytic, and lack genes responsible for virulence, or those that may enhance the pathogenic properties of the host [10]. Genomic sequence analysis confirmed that JBP901 does not contain integrase or virulence-related proteins. However, ORF061 was annotated as a metallo-beta-lactamase superfamily protein that contains a conserved domain for the lactamase_B family. Considering the low amino acid sequence similarity (approximately 27 % identity with 62 % query coverage) to metal-dependent hydrolase in B. cereus (GenBank taxon: 1396), its functional identity remains unclear and needs to be studied further.

Previously, JBP901 was tentatively classified as a member of the SPO1-like group phages of the family Myoviridae, based on its morphological features [26]. However, genome analysis revealed that JBP901 lacks dUMP hydroxymethylase (deoxyuridylate hydroxymethyl-transferase), a signature gene (Gp 29 in SPO1) of SPO1 phage [28]. In addition, the top-scoring hits in BLAST analysis (BLASTP) of JBP901 ORFs are from the Bcp1 [25], Bc431v3 [7], and BCP78 [18] genomes (Supplementary Table 1), not from SPO1-like or Twort-like phages. Likewise, when the whole nucleotide sequence of JBP901 was compared to the large genomes of Myoviridae phages using Easyfig [29], shared synteny was found among the phages Bcp1, BCP78, Bc431v3 and JBP901 (Fig. 1A) but not with SPO1 and Twort (Fig. 1B).

Fig. 1 (A) Genome comparison of phage JBP901 and the related phages Bcp1 (NC_024137), Bc431v3 (JX094431), BCP78 (JN797797), phiAGATE (NC_020081), and Bastille (JF966203). The figure was constructed with Easyfig v 2.1. (B) Genome comparison of phage SPO1 (NC_011421), Twort (AY954970), and BCP8-2 using Easyfig v 2.1. The gray region shows sequence similarity.

CoreGenes analysis using a BLASTP threshold score of 75 as described previously [16] showed that JBP901 shared a high proportion of its proteome with Bc431v3 (the percentage of proteins shared was 89.6 %), Bcp1 (84.1 %) and BCP78 (67.2 %) but a low proportion with Twort (33.3 %) or SPO1 (29.9 %). When pairwise comparison among the three most similar phages, Bc431v3, Bcp1 and JBP901, was performed using CoreGenes, 163 genes were similar for all three phages, while 34, 17 and 9 genes were shared only between Bcp1 and Bc431v3, between Bc431v3 and JBP901, and between Bcp1 and JBP901, respectively (Supplementary Fig. 2). Twenty-three, 24 and 12 genes were identified to be unique to Bcp1, Bc431v3 and JBP901, respectively (Supplementary Fig. 2).
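As a consistency check on the three-way comparison, the gene totals implied for each phage can be obtained by summing the shared and unique counts quoted above. The sketch below only re-adds numbers from the text; the dictionary layout and the function name `genes_per_phage` are illustrative, and it assumes that each gene is counted in exactly one region of the three-way diagram.

```python
from typing import Dict, FrozenSet

PHAGES = ("Bcp1", "Bc431v3", "JBP901")

# Gene counts per region, keyed by the set of phages that share those genes.
venn: Dict[FrozenSet[str], int] = {
    frozenset(PHAGES): 163,                   # shared by all three
    frozenset({"Bcp1", "Bc431v3"}): 34,       # shared by this pair only
    frozenset({"Bc431v3", "JBP901"}): 17,
    frozenset({"Bcp1", "JBP901"}): 9,
    frozenset({"Bcp1"}): 23,                  # unique to one phage
    frozenset({"Bc431v3"}): 24,
    frozenset({"JBP901"}): 12,
}


def genes_per_phage(regions: Dict[FrozenSet[str], int]) -> Dict[str, int]:
    """Sum, for each phage, the counts of every region that includes it."""
    return {
        phage: sum(n for members, n in regions.items() if phage in members)
        for phage in PHAGES
    }


totals = genes_per_phage(venn)
print(totals)  # {'Bcp1': 229, 'Bc431v3': 238, 'JBP901': 201}
```

The JBP901 total obtained this way, 201, equals the number of ORFs reported for the genome.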

A maximum-likelihood (ML) tree (Supplementary Fig. 3) drawn with either the putative major capsid proteins or the tail sheath proteins of JBP901 and other phages (including all ICTV-classified Spounavirinae phages) supported the conclusion that JBP901 is not a member of the SPO1-like phages. Taken together, our data support the presence of a distinguishable group in the subfamily Spounavirinae, different from the SPO1-like phages under the current ICTV [14] classification scheme for the family Myoviridae.

Nucleotide sequence accession number

The whole genome sequence of phage JBP901 was deposited in the GenBank database with the accession number KJ676859.