Introduction

Caldibacillus debilis strain GB1 was isolated from a marsh contaminated by seasonal manure run-off from a cattle barn (Wushke et al. 2013). Physiological comparison of the type strain C. debilis DSM 16016 with C. debilis GB1 (Wushke et al. 2015) revealed several phenotypic differences, including differences in end-product yields, in the extent of cell lysis during stationary phase, and in the ability to grow anaerobically. Genomic and proteomic characterization of C. debilis GB1 was conducted previously with a focus on core metabolism (Wushke et al. 2017).

Phage-host interactions affect microbial physiology, biogeochemical and ecological processes, biofilm formation in environmental communities, and horizontal gene transfer (Bohannan and Lenski 2000; Ochman et al. 2000; Sutherland et al. 2004; Weitz and Wilhelm 2012). Within the domain Bacteria, thermophilic phages remain a relatively novel area of study, with only a handful isolated from thermophilic Firmicutes, such as bacteriophages GBSV1, D6E, GVE1, and GVE2, of which only GVE2 is lytic (Doi et al. 2013; Wang and Zhang 2010; Liu et al. 2006). Few thermophilic tailed Siphoviridae bacteriophages have been characterized in the literature (Liu et al. 2009; Liu and Zhang 2008; Doi et al. 2013). Thermophilic Firmicutes carrying thermophilic (pro)phage genomes are rarely reported, described, or characterized (Sidhu 2000).

Caldibacillus debilis may have uses in biofuel applications (Wushke et al. 2015). Part of evaluating an organism for industrial usefulness is determining its genomic stability. In general, genomic stability is negatively affected by mobile genomic elements such as integrative phages and transposons (Ochman et al. 2000). Finding, describing, and understanding potential genome-destabilizing elements is therefore key to evaluating an organism for industrial usefulness.

Bioinformatic analyses identified putative genes encoded by the Siphoviridae-like bacteriophage CBP1 within the C. debilis genome. We have characterized the gene cassette of CBP1, a cryptic (pro)phage detected within the C. debilis GB1 genome, and demonstrated that some of its gene products are actively expressed during the early exponential phase of cell growth.

Methods

Cell growth

For all experiments, C. debilis strain GB1, DSM 29516 (Wushke et al. 2013), was grown on cellobiose in a modified 1191 medium (Islam et al. 2006) with a lower concentration of yeast extract (0.76 g/L), an initial pH adjusted to 7.2, and an incubation temperature of 60 °C. Aerobic environments in Balch tubes were prepared as previously described by Wushke et al. (2015, 2017). Plating was conducted using cellobiose as a substrate on modified 1191 medium as previously described by Wushke et al. (2015, 2017). C. debilis strain Tf, DSM 16016, was grown using the same medium and conditions as C. debilis GB1 and was used to create a lawn for plaque assays.

Plaque assay

Caldibacillus debilis GB1 was grown to stationary phase (24 h post inoculation). All plating was done in triplicate. C. debilis GB1 supernatant (0.1 mL) was filter-sterilized (0.22 μm filter), mixed with liquid 1% agar m-1191 inoculated with 1% C. debilis Tf DSM 16016, and incubated for 10 min at 60 °C to allow phage adsorption. The liquid 1% agar m-1191 was then poured into plates and allowed to solidify. These plates were then incubated for 12, 24, and 48 h. In an attempt to induce the lytic cycle, half of the plates were exposed to UV light for 30 s after 12 h of incubation using the UV light in the biosafety cabinet (1300 Series A2 from Thermo Scientific). The UV-exposed plates were returned to the incubator and checked every 3 h until 48 h for plaque formation.

Genomic analysis

The C. debilis GB1 genome description was published previously (accession number: AZRV00000000; Wushke et al. 2017). The genome sequence of the type strain C. debilis Tf is available in the NCBI database (accession number: ARVR00000000). Complete cryptic (pro)phage genomes in C. debilis GB1 and C. debilis Tf were searched for using the Phage Search Tool (PHAST) (Zhou et al. 2011). PHAST allowed identification of putative integration sites as well as phage-associated open reading frames (ORFs), which typical bacterial annotation pipelines may not interpret correctly (Zhou et al. 2011).

Proteomic analysis

The proteome of C. debilis GB1 under aerobic and anaerobic growth conditions was previously described by Wushke et al. (2017). The methods used to extract, purify, and analyze the proteome are also described by Wushke et al. (2017). The total ion current (TIC) values for each protein under aerobic and anaerobic conditions were pooled and the result expressed as Log2(TIC), as shown in Table 1. Data generated by that study were further analyzed with a focus on bacteriophage-associated genes. These methods and analyses were applied to the CBP1 (pro)phage genome (accession MF595878).
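
The pooling and log transformation described above can be sketched as follows. This is a minimal illustration rather than the original analysis pipeline: it assumes that "pooled" means summing the TIC values from the two growth conditions, and the function name and input values are hypothetical.

```python
import math
from typing import Optional


def pooled_log2_tic(tic_aerobic: float, tic_anaerobic: float) -> Optional[float]:
    """Return Log2 of the pooled TIC for one protein.

    Assumption (not stated in the source): pooling is a simple sum of the
    TIC observed under aerobic and anaerobic conditions.
    Returns None when the protein was not detected in either condition,
    because log2(0) is undefined.
    """
    pooled = tic_aerobic + tic_anaerobic
    if pooled <= 0:
        return None
    return math.log2(pooled)


# Hypothetical TIC values chosen only to make the arithmetic easy to follow.
print(pooled_log2_tic(512.0, 512.0))  # 10.0
print(pooled_log2_tic(0.0, 0.0))      # None (not detected)
```

Returning None for undetected proteins keeps them distinguishable from proteins with a genuinely low score.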

Table 1 Annotation of the CBP1 genome

Results

Identification of phage genomes in C. debilis GB1

Analysis of the C. debilis GB1 genome (Wushke et al. 2017) revealed several distinct regions where exogenous DNA was integrated into the main bacterial chromosome. Due to the significant differences in physiology between C. debilis GB1 and C. debilis Tf, and the presence of significant amounts of exogenous DNA in the C. debilis GB1 genome, an attempt was made to identify and characterize any (pro)phage(s) using the ‘omics information previously generated for C. debilis GB1.

One potential whole phage genome appeared to be integrated in the C. debilis GB1 genome (Table 1). This cryptic prophage was designated CBP1. The chromosomal region containing the cryptic prophage was putatively identified within Contig 16 (bp position 43,270–80,585). Analysis of Contig 16 with the Phage Search Tool (PHAST) identified a cryptic prophage region of ~37,315 bp with a GC content of 42% (Fig. 1), distinct from the C. debilis GB1 genome, which has a GC content of ~51%. The CBP1 genome was submitted to NCBI under accession number MF595878.
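
The region length and GC content quoted above are simple to recompute from contig coordinates and sequence. The following is a minimal sketch with hypothetical function names and a toy sequence; it is not how PHAST computes these values internally.

```python
def region_length(start: int, end: int) -> int:
    """Length of a region in bp, taken as the coordinate difference.

    The text reports ~37,315 bp, which matches end - start; counting both
    end positions inclusively would give one base more.
    """
    return end - start


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / acgt


# Coordinates of the CBP1 region on Contig 16, as given in the text.
print(region_length(43_270, 80_585))  # 37315

# Toy sequence for illustration only (not CBP1 sequence).
print(gc_content("ATGCGCATTA"))       # 0.4
```

A prophage whose GC content differs markedly from that of its host, as with 42% versus ~51% here, is a common indicator of horizontally acquired DNA.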

Fig. 1

A schematic representation of the cryptic prophage CBP1 genome on Contig 16 (host DNA omitted) produced by IMG. The directions of the arrows indicate the putative direction of transcription (white: unknown function, red: COG X, blue: COG F, light tan: COG L, green: COG O)

Plaques

Caldibacillus debilis DSM 16016 was used as a lawn in an attempt to isolate plaques because genome analysis revealed that it contained no phage or prophage genomes. No zones of clearing were observed when plaque assays were done using GB1 supernatant from late stationary phase. Lawns grew as robustly as when no GB1 supernatant was added. When GB1 was used to create a lawn, UV was used in an attempt to induce the phage lytic cycle; treatment with UV did not result in plaque formation or lysis of the lawn.

Analysis of the CBP1 genome

Many of the putative ORFs encoded by the CBP1 genome had the highest nucleotide sequence identity to another thermophilic Siphoviridae genome, GBSV1, found in Geobacillus sp. 6k51 (Liu et al. 2009). A comparison of the GBSV1 and CBP1 genomes revealed regions with greater than 30% average nucleotide sequence identity per 100 bp (Figure S1). A total of 69 putative ORFs (Table 1) were identified in the CBP1 genome using PHAST, 62 of which were transcribed in the same direction, and 8 in the opposite direction. Twenty-seven (27) of the ORFs most closely matched the bacterial or GenBank database, and 42 ORFs most closely matched the viral and prophage database at the protein level. The protein sequences of twenty-three (23) ORFs showed extreme variability compared to the databases, with e-values of 0. For forty (40) of the putative ORFs, the top tblastn hits were to phages of Gram-positive bacteria (Table 1), with many matching ORFs in GBSV1. Moreover, inspection of the PHAST results showed that ORFs 45–55 were functionally syntenic between CBP1 and GBSV1.

Thermophilic Siphoviridae phages are presumed to survive and replicate via both lytic and temperate cycles (Liu and Zhang 2008). CBP1 appears to encode all the functions necessary for a lytic/temperate life cycle, including gene homologues for DNA replication (ORFs 21, 23), DNA recombinase/invertase (ORFs 66, 67, 69), lytic genes (ORFs 41, 61, 62), phage capsid structural genes (ORFs 40, 43, 47, 49, 55, 57), and packaging (ORF 45). The general functions as annotated by PHAST are shown in Figure S2. Several of the PHAST hits in CBP1 were similar not only to other Bacilli-related phages, but also to Clostridium-related phages. Three Clostridium-related phage genes were identified in the genome of phage CBP1: a terminase, a phage portal protein, and a Clp protease (Table 1; ORFs 40, 41, 42). Several of the most closely related phage proteins were from phages of mesophilic hosts, according to the PHAST analysis in Table 1.

Proteomic expression of phage genes

Of the 69 putative ORFs in CBP1, proteins corresponding to 5 were detected within the proteome (under both anaerobic and aerobic growth conditions). The genes were not equally expressed and displayed different Log2TIC scores, shown in Table 1. Notably, ORFs 4 and 31 had much higher Log2TIC scores (19.42 and 20.85, respectively) than ORFs 7, 9, and 63 (11.38, 10.36, and 14.4, respectively). The high Log2TIC scores of ORFs 4 and 31 were similar to those of proteins of central metabolism expressed in C. debilis GB1 during growth (Wushke et al. 2017). Three of the expressed proteins (ORFs 4, 31, 63) had e-values of 0, suggesting these proteins are unique and located in areas of possible hypervariability of the CBP1 genome. A transcriptional regulator (CBP1 ORF 9), a protein homologous to the bacteriophage λ repressor protein (cI), was expressed during cell growth. This is consistent with a prophage that was repressed at the time of sampling (Ackers et al. 1982). The other 4 expressed proteins were hypothetical proteins of undetermined function as characterized by PHAST. To further characterize the detected phage-associated proteins, InterProScan analysis was used, as shown in Table S1 (Zdobnov and Apweiler 2001). InterProScan analysis did not provide significant further insight compared to the PHAST analysis. Detection of five phage-associated proteins shows that the phage genes are active in some capacity. Proteomic analyses of purified phages from Geobacillus typically identify only the phage structural proteins (Liu and Zhang 2008; Liu et al. 2009), whereas we have observed protein expression from the prophage during host growth (Wushke et al. 2017).
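
Because these scores are on a log2 scale, a difference of d units between two proteins corresponds to a 2^d-fold difference in pooled TIC. The sketch below uses the five scores quoted in the text; the dictionary and function names are hypothetical, and TIC should be read only as a rough, semi-quantitative proxy for protein abundance.

```python
# Log2(TIC) scores quoted in the text for the five detected CBP1 proteins,
# keyed by ORF number.
log2_tic = {4: 19.42, 31: 20.85, 7: 11.38, 9: 10.36, 63: 14.4}


def fold_difference(orf_a: int, orf_b: int, scores: dict = log2_tic) -> float:
    """Fold difference in pooled TIC of orf_a relative to orf_b.

    Subtracting log2 values and exponentiating recovers the linear ratio.
    """
    return 2 ** (scores[orf_a] - scores[orf_b])


# List the detected ORFs from highest to lowest score.
for orf, score in sorted(log2_tic.items(), key=lambda kv: kv[1], reverse=True):
    print(f"ORF {orf}: Log2(TIC) = {score}")

# Ratio between the highest-scoring ORF (31) and the putative repressor (ORF 9).
print(f"ORF 31 vs ORF 9: {fold_difference(31, 9):.0f}-fold")
```

On this reading, the gap of roughly 10 log2 units between ORF 31 and ORF 9 corresponds to a difference of about three orders of magnitude in pooled TIC.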

Discussion

The genus Caldibacillus (formerly within the genus Geobacillus) is a single-species sister genus to Geobacillus (Coorevits et al. 2012). Caldibacillus and Geobacillus are both thermophilic Bacilli (Banat et al. 2004; Coorevits et al. 2012). Geobacillus species have been examined explicitly for industrial uses, so identifying and characterizing potential genome-destabilizing elements in such strains is important. Bacteriophages in the family Siphoviridae are known to infect a broad range of Firmicutes, including Clostridium hosts (Horgan et al. 2010), and several genes encoded by the bacteriophage CBP1 and GBSV1 genomes showed their highest sequence identity to genes encoded by phages isolated from Clostridia (Hargreaves et al. 2013; Yoon and Hyo 2011). This fits with the fact that C. debilis and Clostridium thermocellum have been found within the same environment (Wushke et al. 2013). From the PHAST analysis, three Clostridium-related phage genes were identified in the CBP1 genome, matching phages phiMMP04 and phiSM101 (Hargreaves et al. 2013; Nariya et al. 2011). These genes could assist in the life cycle of thermophilic Siphoviridae in a Clostridium host. This would support the notion that thermophilic Firmicutes act as a natural reservoir of thermophilic Siphoviridae and that these phages have multiple host targets (Lucchini et al. 1998; Brüssow et al. 2001). Thermophilic Geobacillus and Caldibacillus have been isolated from mesophilic environments (Banat et al. 2004; Wushke et al. 2013); it is possible that this phage could be transferred between mesophilic hosts as well. Bacteriophages of this type may represent a vector for genetic transfer between these organisms. Our analyses found nothing present in, or missing from, the CBP1 genome that would preclude CBP1 from forming viable phage particles under the right conditions.

The expression of 5 proteins encoded by the CBP1 genome, one of which (ORF 9) appears similar to the λ repressor protein (cI), suggests that CBP1 could be a phage in a repressed state. The function of the other 4 proteins observed is unknown, but their expression may also be associated specifically with the repressed state of the CBP1 phage.