Introduction

Sparked by the revolutionary success of thermophilic polymerases, interest in thermophiles has increased steadily over the last 50 years, such that they have become among the best-studied extremophiles. This interest is mainly due to their potential use as cell factories for biotechnologically important biomolecules as well as their use as models for primordial life forms. Consequently, thermophiles have been subjects of genome sequencing projects that even led to the establishment of the phylogenetic position of Archaea as the third domain of life (Bergquist et al. 2014).

Representatives of the genus Brevibacillus are Gram-positive, motile, red-pigmented, spore-forming, aerobic bacteria, collected from diverse environmental habitats including rocks, dust, aquatic environments, and guts of various insects and animals (Nicholson 2002). The Brevibacillus genus includes species with biotechnologically important features such as biodegradation (Ye et al. 2013) and production of antibiotics (Westman et al. 2013), and Brevibacillus recombinant expression systems have been used to produce recombinant chicken interferon-gamma (Yashiro et al. 2001), cholera toxin B subunit‐insulin B chain peptide hybrid protein (Yuki et al. 2005), antigenic protein VP28 (Caipang et al. 2008), and single-chain antibody (scFv) (Tokunaga et al. 2013). Of a total of 20 species, only 4, namely Brevibacillus aydinogluensis (Inan et al. 2012), Brevibacillus borstelensis (Shida et al. 1995), Brevibacillus limnophilus (Goto et al. 2004), and Brevibacillus thermoruber (Manachini et al. 1985), are identified as thermophilic bacilli.

Considering the advantages associated with fast metabolism, short fermentation times, and low viscosity of the thermophilic production systems, studies were initiated to identify novel thermophiles with outstanding properties. For this, thermophiles isolated from hot spring water samples taken from Turkey and Bulgaria were screened and B. thermoruber strain 423 was found to be a very promising cell factory for microbial exopolysaccharide (EPS) production (Yasar Yildiz et al. 2014). EPSs are natural, biodegradable, and nontoxic high molecular weight polymers of sugar residues that are involved in diverse functions including adhesion, cell-to-cell interactions, biofilm formation, and cell protection against environmental extremes (Toksoy Oner 2013). Due to their physicochemical, biological, and rheological properties, EPSs like xanthan, dextran, pullulan, gellan, and curdlan have found numerous applications in food, medical, bionanotechnology, cosmetic, and environmental sectors. Although industrial or technical producers of these EPSs are mainly bacteria, these microbial EPS production processes hardly compete with their cheaper counterparts utilizing plants or algae (Nicolaus et al. 2010; Toksoy Oner 2013). From this perspective, thermophiles could be considered as efficient cell factories for industrially important EPSs due to elevated production rates that result from their fast metabolic activities (Nicolaus et al. 2010). A systems-based approach to EPS production processes requires knowledge of the microbial genome sequence, which in turn serves as the starting point for detailed analysis of gene-protein associations and for metabolic reconstruction. Considering this fact, the whole genome of B. thermoruber 423 was obtained by genome sequencing performed in duplicate via high-throughput Illumina HiSeq 2000 technology (Yasar Yildiz et al. 2013).
Within the context of this study, this draft genome was used for genome analysis in order to understand the real potential of this thermophilic microorganism and to develop rational strategies for the genetic and metabolic optimization of EPS production. Moreover, to our knowledge, this study is the first genome analysis performed on a thermophilic Brevibacillus species. Genome annotation revealed the essential biological mechanisms and the whole genomic organization of B. thermoruber 423. Using genome data as well as experimental evidence, a preliminary model for EPS biosynthesis was also proposed.

Materials and methods

Bacterial strain

In this study, thermophilic B. thermoruber strain 423 (NBIMCC 8836) described in our previous work (Yasar Yildiz et al. 2014) was used.

Sequencing and assembly

The genome of B. thermoruber strain 423 was sequenced via Illumina HiSeq 2000 technology where paired-end sequencing was conducted with two different insert sizes of 500 and 2000 bp (Yasar Yildiz et al. 2013). The reads were assembled via the SOAPdenovo software version 1.05 (Luo et al. 2012), using a k-mer size of 15 bp and the key parameter K setting at 71 as previously reported (Yasar Yildiz et al. 2013).

Genome annotation

Gene prediction and genome annotation were carried out using the RAST autoannotation server (http://rast.nmpdr.org/) (Aziz et al. 2008). Protein-encoding, ribosomal RNA, and transfer RNA genes were predicted by RAST server. The gene predictions of essential biosystems were manually verified using BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) searches against protein databases, the Universal Protein Resource (UniProt) (http://www.uniprot.org/), and NCBI (http://www.ncbi.nlm.nih.gov/). The gene functions and classifications were based on the subsystem annotation of the RAST server. Information on enzyme-encoding genes was taken from Kyoto Encyclopedia of Genes and Genomes (KEGG) (http://www.genome.jp/kegg/) (Kanehisa et al. 2012) and ExPASy (http://www.expasy.org/) databases (Artimo et al. 2012).

Phylogenetic and phylogenomic analyses

Phylogenetic classifications of the microorganisms belonging to the Brevibacillus genus were based on the 16S ribosomal RNA (16S rRNA) sequences of each species. For the phylogenomic comparison, the whole-genome sequence of B. thermoruber 423 (GenBank accession number ATNE00000000.1) was compared with whole-genome sequences of the closest members of the Brevibacillus genus [Brevibacillus brevis NBRC 100599 (GenBank accession number AP008955.1), Brevibacillus laterosporus DSM 25 (GenBank accession number ARFS00000000.1), Brevibacillus panacihumi (GenBank accession number AYJU00000000.1), and Brevibacillus massiliensis (GenBank accession number CAGW00000000.1)]. Genome sequences were obtained from the NCBI database and annotated by the MicroScope Microbial Genome Annotation & Analysis Platform (https://www.genoscope.cns.fr/agc/microscope/home/index.php). Comparison was based on the encoded proteins with putative functions assigned by this platform. Permissive parameters were used for comparison (50 % amino acid identity, 80 % amino acid alignment coverage).
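The permissive thresholds above can be illustrated with a minimal Python sketch that filters pairwise protein alignments by identity and coverage. The record layout, the protein identifiers, and the definition of coverage (aligned length relative to the longer of the two proteins) are assumptions made for illustration only; the exact definition applied by the MicroScope platform is not stated here.

```python
from dataclasses import dataclass

MIN_IDENTITY = 50.0  # percent amino acid identity (threshold from the text)
MIN_COVERAGE = 80.0  # percent alignment coverage (threshold from the text)


@dataclass
class Hit:
    """One pairwise protein alignment (hypothetical record layout)."""
    query: str
    subject: str
    identity: float      # percent identical residues in the alignment
    aligned_length: int  # number of aligned residues
    query_length: int
    subject_length: int


def coverage(hit: Hit) -> float:
    """Coverage in percent relative to the longer protein (an assumption)."""
    longest = max(hit.query_length, hit.subject_length)
    return min(100.0, 100.0 * hit.aligned_length / longest)


def passes(hit: Hit) -> bool:
    """True when the hit satisfies both permissive thresholds."""
    return hit.identity >= MIN_IDENTITY and coverage(hit) >= MIN_COVERAGE


# Toy alignments with invented identifiers, not data from this study.
hits = [
    Hit("protA_1", "protB_1", 62.0, 290, 300, 310),  # passes both thresholds
    Hit("protA_2", "protB_2", 48.0, 295, 300, 300),  # identity too low
    Hit("protA_3", "protB_3", 75.0, 150, 300, 320),  # coverage too low
]
kept = [h for h in hits if passes(h)]
print([h.query for h in kept])
```

Only pairs passing both thresholds would be grouped into the same protein family in such a scheme.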

Comparative genome analysis

Whole-genome sequences of a total of 11 strains from six different Brevibacillus species [B. thermoruber 423 (GenBank accession number ATNE00000000.1), B. laterosporus DSM 25 (GenBank accession number ARFS00000000.1), B. laterosporus PE36 (GenBank accession number AXBT00000000.1), B. laterosporus GI-9 (GenBank accession number CAGD00000000.1), B. laterosporus LMG15441 (GenBank accession number AFRV00000000.1), B. brevis FJAT-0809-GLX (GenBank accession number AHKL00000000.1), B. brevis NBRC 100599 (GenBank accession number AP008955.1), B. brevis X23 (GenBank accession number AKYF00000000.1), Brevibacillus agri (GenBank accession number AOBR00000000.1), B. panacihumi (GenBank accession number AYJU00000000.1), and B. massiliensis (GenBank accession number CAGW00000000.1)] were obtained from the NCBI database and annotated by the RAST server. Encoded proteins that take a role in carbohydrate utilization or EPS production mechanism were sorted out for all 11 genomes and comparatively analyzed based on the presence/absence of proteins.
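A presence/absence comparison of this kind can be sketched in a few lines of Python. The genome labels and annotation sets below are toy placeholders rather than the actual RAST output, and the helper names are hypothetical.

```python
def presence_absence(annotations):
    """Map each function to {genome: True/False} across all genomes."""
    functions = sorted(set().union(*annotations.values()))
    return {
        function: {genome: function in found
                   for genome, found in annotations.items()}
        for function in functions
    }


def unique_to(annotations, genome):
    """Functions annotated in `genome` and in no other genome."""
    others = set().union(
        *(found for name, found in annotations.items() if name != genome)
    )
    return annotations[genome] - others


# Toy annotation sets (placeholders, not results from this study).
annotations = {
    "strain_A": {"xylose isomerase", "xylulose kinase", "glycosyltransferase"},
    "strain_B": {"glycosyltransferase"},
    "strain_C": {"glycosyltransferase", "O-antigen flippase"},
}

matrix = presence_absence(annotations)
for function, row in matrix.items():
    marks = " ".join("+" if present else "-" for present in row.values())
    print(f"{function:22s} {marks}")
print(sorted(unique_to(annotations, "strain_A")))
```

Each row of the resulting matrix corresponds to one protein function and each column to one genome, which is the layout assumed for a table such as Supplementary Table S4.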

Microbial growth and carbohydrate utilization

B. thermoruber 423 cells were grown in the optimized medium (OM) according to the previously described conditions (Yasar Yildiz et al. 2014). Twelve different sugars (maltose, glucose, lactose, sucrose, arabinose, xylose, raffinose, fructose, galactose, mannose, glycerol, or mannitol) were tested as a carbon source in this medium. In all these experiments, shaking cultures were incubated for 14 h at 55 °C and 180 rev/min. Cell growth was monitored by measuring the optical densities at 660 nm using Lambda35 UV/Vis spectrophotometer (PerkinElmer, Waltham, MA, USA). To convert optical density (OD) values to biomass concentration in terms of grams dry cell weight per liter, a calibration chart was prepared by a gravimetric method as described in Poli et al. (2009). To assess cellular integrity of the cultures, at certain time points, cell-free fermentation medium was analyzed for its protein and nucleic acid contents by the Bradford test (Bradford 1976) and spectrophotometric measurements at 260 nm, respectively.

For carbohydrate analysis, samples were taken at regular intervals and centrifuged at 4602 × g for 10 min to remove cells. Sugar content analysis in the supernatant was performed by Agilent 1100 High Performance Liquid Chromatography (HPLC) system with refractive index (RI) detector using the Zorbax Carbohydrate Analysis Column 4.6 × 250 mm (Agilent, Santa Clara, CA, USA). The cell-free samples were filtered through a 0.2-μm filter prior to use and then diluted with an equal volume of 50:50 acetonitrile/water solution. The flow rate was 1.4 mL/min, the mobile phase was 75:25 acetonitrile/water, and the temperature of the column was 30 °C. Carbohydrate uptake rates were calculated from the slopes of the carbohydrate utilization profiles and expressed as g/L · h. Results are averages of at least three independent experiments.
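As an illustration of the slope-based calculation described above, the following minimal Python sketch fits a least-squares line to a sugar concentration time course and reports the negative slope as the uptake rate in g/L · h. The function names and the sample time course are hypothetical and are not data from this study; the fitting procedure actually used is not specified in the text.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def uptake_rate(times_h, sugar_g_per_l):
    """Uptake rate in g/L.h: the negative slope of the residual sugar profile."""
    return -least_squares_slope(times_h, sugar_g_per_l)


# Hypothetical time course: residual sugar (g/L) sampled every 2 h.
times_h = [0, 2, 4, 6, 8]
sugar_g_per_l = [10.0, 9.0, 8.0, 7.0, 6.0]

rate = uptake_rate(times_h, sugar_g_per_l)
print(f"uptake rate = {rate:.2f} g/L.h")
```

In practice the fit would be restricted to the linear part of each profile, and the rates from independent experiments would then be averaged.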

EPS production

To obtain the EPS production profiles, B. thermoruber 423 cells were grown in OM containing 10 g/L maltose, glucose, lactose, sucrose, arabinose, xylose, raffinose, fructose, galactose, mannose, glycerol, or mannitol. Shaking cultures were incubated for 14 h at 55 °C and 180 rev/min and samples were taken at regular intervals. For the isolation of EPS, cells were harvested by centrifugation at 4602 × g for 10 min, and the supernatant phases were treated with an equal volume of ethanol at 4 °C, held at −18 °C overnight, and then centrifuged at 13,523 × g at 4 °C for 30 min using a refrigerated centrifuge. The pellets were dissolved in hot distilled water and analyzed for their carbohydrate content via the phenol/sulfuric acid method using glucose as standard (Dubois et al. 1956). Results are averages of at least three independent experiments.

Nucleotide sequence accession numbers

The genome project for B. thermoruber 423 has been deposited at DDBJ/EMBL/GenBank under the accession ATNE00000000. This is the first version, with accession number ATNE01000000. The 16S rRNA gene sequence is available in DDBJ/EMBL/GenBank under the accession number KF192950.

Results

General features of the B. thermoruber 423 genome

The whole-genome sequencing project of B. thermoruber 423 employed Illumina HiSeq 2000 technology and resulted in clean sequence data of 603 Mb providing approximately 150-fold coverage (Yasar Yildiz et al. 2013). The gene prediction and genome annotation of draft genome of B. thermoruber 423 (Supplementary Table S1) resulted in a total of 4446 coding sequences and 112 RNA genes (Table 1). Putative functions could be assigned to 3020 protein-coding genes, whereas 1426 hypothetical proteins had no match to any known proteins. Forty-two percent of the total coding sequences could be assigned to subsystems (Supplementary Tables S2 and S3). The gene annotations revealed that the complete pathways of glycolysis, gluconeogenesis, pentose-phosphate, Entner–Doudoroff, and all de novo amino acid biosynthesis were present in the B. thermoruber 423 genome. The tricarboxylic acid (TCA) cycle and glyoxylate bypass were also complete, and genes corresponding to aerobic respiration and ethanol fermentation pathways were also identified. The biosynthetic pathways for most vitamins (thiamin, riboflavin, vitamin B6, vitamin B12, biotin, folate, phylloquinone, menadione) were also present.

Table 1 General features of Brevibacillus thermoruber 423 draft genome

Phylogenetic and phylogenomic analysis

16S rRNA sequence of B. thermoruber 423 showed 99 % identity to B. thermoruber JAM-FM2501 (GenBank accession number AB362290.1). Three other thermophilic Brevibacillus species, namely B. aydinogluensis PDF25 (GenBank accession number NR117986.1), B. borstelensis UTM105 (GenBank accession number KF952566.1), and B. limnophilus DSM6472 (GenBank accession number NR024822.1), were found to be 99, 97, and 95 % identical, respectively. Additionally, phylogenetic classification of the Brevibacillus genus that was based on the 16S rRNA sequences of the species indicated close relation of B. thermoruber 423 to the two thermophilic species B. thermoruber B4M and B. aydinogluensis DSM 6348 T (Fig. 1).

Fig. 1

Phylogenetic tree based on 16S rRNA sequences for the genus Brevibacillus

The relationships between B. thermoruber 423 and four other members of the Brevibacillus genus were analyzed by a phylogenomics approach, which was based on the encoded proteins with putative function assignments (Fig. 2) (Miele et al. 2011). Family/gene numbers of the pan-genome, core-genome, and variable genome are 13117/26428, 1556/9518, and 11561/16910, respectively. B. thermoruber 423 has 957 unique proteins with putative functions that constitute 22.1 % of the total CDS. Gene counts for each organism are given in Table 2.
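The pan-, core-, and variable-genome partition can be sketched with plain set operations in Python. The family identifiers below are toy placeholders and the function name is hypothetical; only the reported family/gene counts are taken from the text, and the final check simply confirms that core and variable counts add up to the pan-genome counts.

```python
def pan_core_variable(families_by_genome):
    """Split protein families into pan-, core-, and variable-genome sets."""
    family_sets = list(families_by_genome.values())
    pan = set().union(*family_sets)         # families found in any genome
    core = set.intersection(*family_sets)   # families found in every genome
    variable = pan - core                   # families missing from some genome
    return pan, core, variable


# Toy family assignments (placeholders, not the MicroScope families).
toy = {
    "genome_1": {"F1", "F2", "F3"},
    "genome_2": {"F1", "F2", "F4"},
    "genome_3": {"F1", "F5"},
}
pan, core, variable = pan_core_variable(toy)
print(len(pan), len(core), len(variable))

# Reported (families, genes) counts from the text.
reported = {
    "pan": (13117, 26428),
    "core": (1556, 9518),
    "variable": (11561, 16910),
}
families_consistent = (
    reported["core"][0] + reported["variable"][0] == reported["pan"][0]
)
genes_consistent = (
    reported["core"][1] + reported["variable"][1] == reported["pan"][1]
)
print(families_consistent, genes_consistent)
```

Under this definition a family unique to one genome belongs to the variable genome, which is how strain-specific proteins would be counted.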

Fig. 2

Venn diagram comparing the encoded proteins of B. thermoruber 423, B. brevis NBRC100599, B. massiliensis, B. laterosporus DSM25, and B. panacihumi W25. The numbers of shared and unique proteins are shown

Table 2 Pan/core genome analysis of the Brevibacillus genus

Transport

Transport systems are vitally important for all organisms. In the B. thermoruber 423 genome, 299 open reading frames (ORFs) (9.5 % of total) involved in various transport systems were found, which enable the bacterium to accumulate necessary nutrients, extrude unwanted by-products, and maintain a cytoplasmic content of protons and salts conducive to growth and development. Among these ORFs, 160 were categorized under subsystems whereas 139 were not assigned to a subsystem. Their products include transporters for ions (Na+, K+, Mg2+, Zn2+, Mn2+, Fe2+/3+, Co2+, Ni2+, and Mo2+), anions and cations (ammonium, phosphate, sulfate, alkanesulfonates, and malonate), amino acids (glutamate, aspartate, glycine, proline, methionine, cysteine, glutamine), and other molecules (biotin, hydroxymethylpyrimidine, vitamin B12, niacin-choline, tricarboxylate, glutathione, glycerate, gluconate, D-allose, aminobenzoyl-glutamate, cyanate, chromate, spermidine/putrescine, glycine betaine, benzoate, glycerol-3-phosphate) as well as transporters for multidrug resistance, which have homologs in various microorganisms (Supplementary Table S2). The genome also contains genes involved in quorum sensing systems, such as those encoding ATP-binding cassette (ABC) transporters (Bth.peg.408, 409, 410, 411, 414) that secrete peptides as autoinducers for quorum sensing and two-component sensor histidine kinase proteins (Bth.peg.1035, 2288, 2440, 2487) for detection of the autoinducers.

Stress tolerance

Stress responses are genetic and physiological changes that cells go through in order to cope with the damage caused by extreme environmental disturbances. B. thermoruber 423 has two molecular chaperone systems: the GroE system and the Hsp70 system (Supplementary Table S1). Genes for several proteins that take a role in the regulation of the heat response, such as heat-inducible transcription repressor HrcA (Bth.peg.161), ribosomal RNA small subunit methyltransferase E (Bth.peg.156), tmRNA-binding protein SmpB (Bth.peg.940), translation elongation factor LepA (Bth.peg.163), nucleoside 5-triphosphatase RdgB (dHAPTP, dITP, XTP-specific) (Bth.peg.400), ribosomal protein L11 methyltransferase (Bth.peg.157, Bth.peg.498), signal peptidase-like protein (Bth.peg.4386), phosphoesterase (Bth.peg.399), ribonuclease PH (Bth.peg.401), and rRNA small subunit methyltransferase I (Bth.peg.4382), were also present in the genome. Additionally, B. thermoruber 423 also has the cold shock proteins CspB (Bth.peg.1417) and CspC (Bth.peg.3398), the major cold-shock-induced proteins of Bacillus sp., and cold-shock DEAD-box protein A (CsdA) (Bth.peg.1444, Bth.peg.1952, Bth.peg.3649), which has a helix-destabilizing activity (Beckering et al. 2002).

In recent studies, heat stress was observed not only to influence cellular components directly but also to induce oxidative stress, which affects global cellular processes that couple with heat (Shih and Pan 2011). According to these studies, in thermophiles thriving at temperatures above the mesophilic range, every component is continually exposed to the high temperatures of living environments and must adapt to function under these conditions; therefore, oxidative stress may be as important as heat stress at elevated temperature. B. thermoruber 423 has 29 ORFs assigned to oxidative stress subsystem that has the highest number of ORFs among the stress response subsystems (Supplementary Table S2).

Osmolarity resistance is one of the key stress responses of B. thermoruber 423. A frequently observed osmolarity resistance strategy is to accumulate intracellular organic solutes, such as ectoine and betaine, when extracellular osmolality rises and to release those solutes rapidly when extracellular osmolality declines (Oren 2002).

B. thermoruber 423 possesses different glycine betaine (GB)/proline betaine (PB) high-affinity uptake systems: OpuA, known from Bacillus subtilis, and ProU and ProP, known from Escherichia coli. The OpuA system is a member of the ABC transporter superfamily and consists of a membrane-associated ATPase (OpuAA, Bth.peg.974), the integral membrane protein (OpuAB, Bth.peg.975), and the extracellular ligand-binding protein (OpuAC, Bth.peg.976). The extracellular OpuAC protein binds GB or PB with high affinity in B. subtilis (Hoffmann et al. 2013) and delivers it to the OpuAA/OpuAB protein complex, which releases the substrate into the cytosol by an ATP-dependent translocation mechanism. B. thermoruber 423 also has a second ABC transporter system, ProU, for GB/PB import into the cytoplasm. The ProU transport system is composed of a GB/PB binding protein ProX (Bth.peg.1675) and permease protein ProV (Bth.peg.1678). B. thermoruber 423 also has a third transporter system, ProP (Bth.peg.1843, Bth.peg.2786, Bth.peg.4293), that has a broad substrate specificity in E. coli and transports a similar set of substrates as ProU (Kappes et al. 1996).

In addition, aquaporin Z (Bth.peg.1237) is predicted to be an integral membrane protein that serves as a channel for the transfer of water across the membrane and takes a role in the osmoregulation of B. thermoruber 423. Furthermore, the presence of four enzymes associated with ectoine biosynthesis was also notable: diaminobutyrate-pyruvate aminotransferase (EC 2.6.1.46, Bth.peg.1281), ectoine hydroxylase (EC 1.17.-.-, Bth.peg.1283), L-ectoine synthase (EC 4.2.1.-, Bth.peg.1282), and L-2,4-diaminobutyric acid acetyltransferase (EC 2.3.1.-, Bth.peg.1280).

Carbohydrate uptake and utilization

Genes encoding the transporters for carbohydrates (maltose/maltodextrin, chitin and N-acetylglucosamine, trehalose, lactose and galactose, D-xylose, D-ribose, fructose, mannitol), and utilization systems for chitin and N-acetylglucosamine, trehalose, lactose and galactose, xylose, ribose, fructose, and mannitol were present in the genome of the B. thermoruber 423 (Supplementary Table S2).

In order to validate sugar uptake and utilization, B. thermoruber 423 cultures were grown in the presence of 12 different carbon sources. Growth and carbohydrate utilization profiles of the cultures showed that sucrose, arabinose, raffinose, mannose, and lactose were not metabolized, which was consistent with the absence of the associated transport genes in the genome. On the other hand, maltose, fructose, mannitol, xylose, glucose, galactose, and glycerol were found to be taken up at differing rates, presumably via the related transporter systems, and then metabolized to form biomass (Table 3). The highest uptake rates were obtained from cultures grown on maltose, followed by fructose, galactose, and xylose.

Table 3 Carbohydrate uptake and utilization by B. thermoruber 423

EPS biosynthesis

In Gram-positive bacteria, the polysaccharidic components of the cell wall involve (i) the capsular polysaccharides (CPSs) that are covalently bound to the peptidoglycan layer and form the thick outer layer (capsule), (ii) wall polysaccharides (WPSs) that are either physically or covalently attached to the cell wall and form a thin outer layer (pellicle), and (iii) exopolysaccharides (EPSs) that are not attached to the cell wall and are released to the cellular environment (Chapot-Chartier 2014; Nicolaus et al. 2010). B. thermoruber 423 is known to produce EPS, which is a heteropolymer of glucose/galactose/mannose/galactosamine/mannosamine (57.7/16.3/9.2/14.2/2.4) (Yasar Yildiz et al. 2014). In previous studies, when the EPS production capacity of B. thermoruber 423 in the presence of different carbon sources was screened, maltose and xylose were found to promote production at comparable levels, and further optimization studies were conducted with maltose (Yasar Yildiz et al. 2014). In this study, the EPS yields of the cultures grown on 12 different carbon sources were also compared (Table 3). Whereas undetectable levels of EPSs were obtained in the presence of sucrose, arabinose, raffinose, mannose, and lactose, the highest yields were obtained from cultures grown on maltose (0.6 g/L) and xylose (0.5 g/L).

Comparative Brevibacillus genome analysis

Brevibacillus includes 20 species, and so far, a total of 11 genome announcements for the strains belonging to the mesophilic species B. agri (Joshi et al. 2013), B. brevis (Che et al. 2013; Chen et al. 2012), B. laterosporus (Djukic et al. 2011; Sharma et al. 2012), B. massiliensis (Hugon et al. 2013), and B. panacihumi (Wang et al. 2014) as well as the thermophilic species B. thermoruber 423 (Yasar Yildiz et al. 2013) were reported. These 11 genomes were comparatively analyzed in terms of carbohydrate utilization and EPS production mechanisms (Supplementary Table S4). Twenty-five subsystems and proteins that take a role in sugar utilization or EPS production but are not assigned to a subsystem were also analyzed. Published literature and genome information of these 11 strains revealed biologically and biotechnologically significant characteristics of the Brevibacillus genus and similarities/differences within the genus itself.

In the literature, Brevibacillus species are reported to be unable to ferment D-xylose and L-arabinose due to the lack of the related genes like xylA (xylose isomerase), xylB (xylulose kinase), araQ (arabinose permease), and araA (arabinose isomerase) (Panda et al. 2014). Accordingly, comparative genome analysis indicated that B. thermoruber 423 as well as the other members of the genus did not have the required genes for fermenting L-arabinose. However, unlike the other species, B. thermoruber 423 was found to be able to utilize D-xylose. Genes encoding the D-xylose transport ATP-binding protein (XylG, Bth.peg.3948) and periplasmic xylose-binding protein (XylF, Bth.peg.3949), as well as xylose isomerase (EC 5.3.1.5, Bth.peg.3943) and xylulose kinase (EC 2.7.1.17, Bth.peg.3945), were present in the genome. Likewise, utilization of D-allose was confined to B. thermoruber 423: the D-allose ABC transporter genes (Bth.peg.1793, 1794, and 1795), the D-allose kinase gene (EC 2.7.1.55, Bth.peg.1798), and the gene for the transcriptional regulator of D-allose utilization (Bth.peg.1791) were present in the genome.

There are mainly three biosynthetic routes for microbial EPSs. Synthesis of some glucan- or fructan-type homopolysaccharides is catalyzed by the action of sucrase enzymes (Rehm 2009). These glucansucrases and fructansucrases are glycoside hydrolases that act on sucrose and catalyze the transglycosylation reactions forming the polymer chain. Lack of the genes encoding these enzymes in the genomes of these six Brevibacillus species suggested the absence of the biosynthesis of glucans or fructans.

The other two pathways for heteropolysaccharides and some homopolysaccharides are more complex and require the uptake of sugar monomers like glucose, fructose, mannose, xylose, or glycerol. These sugar monomers are then converted into nucleoside diphosphate sugars (NDP-sugars) which in turn are assembled on an isoprenoid lipid–phosphate that is located in the cytoplasmic membrane. At this step, repeating monosaccharide units are sequentially transferred from sugar nucleotides by glycosyltransferases, modified by the addition of any acyl groups and then polymerized. Finally, the polysaccharide is secreted from the cell membrane into the extracellular environment (Sutherland 2007). Genome analysis revealed the presence of several genes that could be involved in four distinct steps of the EPS mechanism (Table 4). Genome comparison showed that most of the genes required for NDP-sugar biosynthesis were present in all Brevibacillus species. However, the uridylyltransferases (EC 2.7.7.9 and EC 2.7.7.10) that are required for UDP-glucose biosynthesis and the mannose-1-phosphate guanylyltransferases (EC 2.7.7.13 and EC 2.7.7.22) required for GDP-mannose biosynthesis were absent from all four strains of B. laterosporus (Supplementary Table S4). All the compared genomes were found to contain the essential genes for isoprenoid lipid–phosphate biosynthesis and various glycosyltransferases. Polysaccharide deacetylase enzymes were also present in all species, but chitooligosaccharide deacetylase (EC 3.5.1.-) was only annotated for B. massiliensis (Supplementary Table S4). For the transport of the polysaccharides, there are mainly two mechanisms (Sutherland 2007). The first mechanism involves the export of the polymer across the cytoplasmic membrane by an ABC transporter, and the presence of the associated genes in the genome of B. thermoruber 423 suggested this mechanism for the EPS export (Table 4).
In the second mechanism, isoprenoid-linked intermediates are flipped through the cytoplasmic membrane and then polymerized by a Wzx/Wzy-dependent pathway (Sutherland 2007). Genome comparison revealed that, of the species considered, only B. massiliensis had the O-antigen flippase (Wzx), the repeat unit polymerase (Wzy), and tyrosine-protein kinase (Wzc, EC 2.7.10.2) genes, suggesting the presence of the latter mechanism. The absence of the TPR/glycosyl transferase domain protein in this species provides additional support for the presence of the Wzx/Wzy-dependent pathway (Supplementary Table S4).

Table 4 Predicted genes and proteins involved in EPS biosynthesis

Using the genome information of B. thermoruber 423, a preliminary model for EPS biosynthesis mechanism is proposed (Fig. 3).

Fig. 3

Proposed EPS biosynthesis mechanism of B. thermoruber 423

Features of biotechnological interest

Microbial thermophilic enzymes are known to play a crucial role as metabolic catalysts, leading to their widespread use in various applications in sectors including chemical, textile, food, medical, and pharmaceutical industries (Kumar et al. 2011). Whole-genome sequencing of B. thermoruber 423 revealed the presence of many industrially significant enzymes such as lipases, proteases, glucoamylase, and cellulase (Table 5).

Table 5 Industrially important enzymes predicted by genome analysis of B. thermoruber 423

Besides the Lon (Lee et al. 2004) and fibroinolytic (Suzuki et al. 2009) proteases of B. thermoruber, some industrially significant enzymes from members of Brevibacillus such as thermostable D-alanine amidase from B. borstelensis BCS-1 (Baek et al. 2003), gelatinase from B. laterosporus (Hamza et al. 2006), and bile salt hydrolase from Brevibacillus (Sridevi and Prabhune 2009) have been reported. Experimental evidence is required to assess the presence of these enzymes as well as those listed in Table 5 in B. thermoruber 423.

The thiol-disulfide oxidoreductase Bbd, which facilitates disulfide bond formation, was reported at the cell periphery of B. choshinensis (Ishihara et al. 1995); a putative thiol-disulfide oxidoreductase (Bth.peg.2077) was also annotated in the B. thermoruber 423 genome.

Isoprenoids belong to a vast group of secondary metabolites that include carotenoids, sterols, polyprenyl alcohols, ubiquinone (coenzyme Q), heme A, and prenylated proteins. They are of valuable commercial interest as food colorants and antioxidants (carotenoids), aroma and flavor enhancers (terpenes), nutraceuticals (ubiquinone), and antiparasitic and anticarcinogenic compounds (taxol) (Sharma et al. 2014). Isoprenoids are known for their diverse biological functions, such as their use as hormones (steroids, gibberellins, and abscisic acid) and their roles in membrane fluidity maintenance (steroids), respiration (quinones), photosynthetic light harvesting (carotenoids), and protein targeting and regulation (prenylation and glycosylation) (Chang and Keasling 2006). They are produced by the deoxyxylulose phosphate (DXP) pathway and/or the mevalonate (MVA) pathway. The genome annotation results showed that B. thermoruber 423 possesses the DXP pathway for isoprenoid biosynthesis and 17 genes were assigned to this subsystem (Supplementary Table S1).

Genome annotation results showed that B. thermoruber 423 has the gene encoding alcohol dehydrogenase (EC 1.1.1.1, Bth.peg.1036) converting acetaldehyde to ethanol. Moreover, all the genes necessary for the bioconversion of (S)-3-hydroxybutanoyl-CoA to butanol, namely the genes for enoyl-CoA hydratase (EC 4.2.1.17, Bth.peg.492), butyryl-CoA dehydrogenase (EC 1.3.99.2, Bth.peg.116), acetaldehyde dehydrogenase (EC 1.2.1.10, Bth.peg.2921), and NADH-dependent butanol dehydrogenase A (EC 1.1.1.-, Bth.peg.3374), were present in the genome. All these genes for ethanol and butanol biosynthesis were also annotated in the other Brevibacillus strains except that the butanol dehydrogenase gene catalyzing the final step in the pathway was not present in the four genomes of B. laterosporus species (Supplementary Table S4).

The B. thermoruber 423 draft genome also carries multiple genes that are potentially involved in arsenic resistance. Genes for “arsenate reductase (EC 1.20.4.1),” “arsenical resistance protein, ACR3,” “arsenic efflux pump protein,” “arsenical resistance operon repressor,” and “transcriptional regulator, ArsR family” were present within the draft genome.

Discussion

The microbial genome data form the basis of systems biology research including functional genomics and metabolic engineering studies. Therefore, whole-genome sequencing of B. thermoruber strain 423 was performed, which will not only provide additional information to enhance our understanding of the genetic and metabolic network of thermophilic bacteria but also accelerate research on rational design and optimization of microbial EPS production. Moreover, identification of genes encoding valuable commercial compounds could contribute to the development of industrially important processes with this organism.

Despite the large number of reports on thermophiles, there is very limited information about their EPS production capacity. In previous studies, when thermophiles isolated from hot spring water samples taken from Turkey and Bulgaria were screened, B. thermoruber strain 423 stood out with its EPS production capacity such that bioreactor cultures were found to reach two times higher yields and three times higher productivities when compared with the literature (Yasar Yildiz et al. 2014). Moreover, chemical, rheological, and biological characterization of the produced EPS was also reported, and when grown on maltose, the polymer was found to be mainly composed of glucose (57.7 mol%) followed by galactose (16.3 mol%) and galactosamine (14.2 mol%), with lesser amounts of mannose (9.2 mol%) and mannosamine (2.4 mol%) residues (Yasar Yildiz et al. 2014). The monomeric composition of EPSs highly depends on the carbohydrate uptake and utilization capacity of the producer microorganism, and for each monomeric unit, biosynthesis of its activated sugar nucleotide precursor is required. When the sugar, biomass, and EPS profiles of B. thermoruber 423 cultures grown in the presence of different carbon sources were analyzed, sucrose, arabinose, raffinose, mannose, and lactose were found not to be metabolized, which was consistent with the undetectable levels of biomass and EPS as well as the absence of the associated transport genes in the genome. On the other hand, maltose, fructose, mannitol, xylose, glucose, galactose, and glycerol were found to be utilized at differing rates, yielding biomass and EPS (Table 3). In agreement with previous results, maltose was found to be the best carbon source for EPS production.

Essential genes associated with EPS biosynthesis were detected by genome annotation, and together with experimental evidence, a hypothetical mechanism for EPS biosynthesis was generated (Fig. 3). This mechanism shows the pathways for sugar uptake as well as the biosynthesis of NDP-sugars. However, the annotated glycosyltransferases (Table 4) should be characterized more specifically so that their roles in the whole process can be identified more precisely.

Despite the structural diversity of EPSs, there are only two mechanisms for polymerization, namely the ABC-transporter-dependent pathway and the more commonly used Wzx/Wzy-dependent pathway. Genome information revealed the presence of the ABC-transporter-dependent pathway (Bth.peg.2228, Bth.peg.4273, Bth.peg.3612, Bth.peg.3618, Bth.peg.4275) in EPS biosynthesis by B. thermoruber 423. The GAF domain/GGDEF domain protein (Bth.peg.3214) is an inner membrane protein that is essential for EPS production (Lee et al. 2007). The N-terminus of this protein contains transmembrane (TM) domains. As a result, the N-terminus of the protein is positioned in the inner membrane, whereas the C-terminal region lies in the cytoplasm, where it interacts with the lipid carrier and regulates EPS production (Lee et al. 2007). TPR-like proteins (Bth.peg.511, Bth.peg.430, Bth.peg.2658, Bth.peg.895, Bth.peg.3214) function as scaffold proteins and help in the assembly of a secretion complex. Multimerization of the TPR-like protein and/or its association with the TM domains of the GAF domain/GGDEF domain protein provides the required portal in the inner membrane for the export of the polymer (Franklin et al. 2011). Polysaccharide export proteins (OPX family of proteins) are lipoproteins that adopt an octameric configuration with a large central cavity that facilitates EPS export through the periplasm and across the outer membrane. B. thermoruber 423 has protein-coding gene regions (Bth.peg.255, Bth.peg.3073, Bth.peg.4156, Bth.peg.4336) for lipoproteins that are potential actors in EPS export. The presence of these multiprotein complexes should be demonstrated experimentally by protein-protein interaction studies.

Several microorganisms are known to produce more than one type of EPS, each synthesized by a specific gene cluster. Hence, using the genome annotations of this study, the genetic structure of the operon(s) for EPS biosynthesis should be elucidated, and the exact number of different EPSs produced by this organism should be verified both in silico and in vitro.

Bis-(3′,5′)-cyclic-dimeric-guanosine monophosphate (c-di-GMP) is known to act as an allosteric regulator of cellulose and alginate biosynthesis, and as both an allosteric and transcriptional regulator of Pel EPS (Hengge 2009). The presence of such a regulator should be investigated for EPS biosynthesis by B. thermoruber 423. The dependence of EPS biosynthesis on an isoprenoid lipid carrier should also be clarified by investigating the biosynthetic proteins. Overall, the proposed mechanism calls for experimental verification using molecular genetics and omics tools.

D-Xylose, also called wood sugar, is the principal constituent of the polysaccharide xylan, which is the main structural component of the hemicellulose in plant cell walls and thus one of the most abundant heteropolymers in nature. In B. thermoruber 423, the probable route of xylose metabolism is that xylose is first converted to xylulose by xylose isomerase (EC 5.3.1.5, Bth.peg.3943), then phosphorylated by xylulose kinase (EC 2.7.1.17, Bth.peg.3945), and finally metabolized through the pentose phosphate pathway. Besides xylose utilization, the B. thermoruber 423 genome was also found to encode an endo-1,4-β-glucanase (cellulase) enzyme and to contain the complete pathways for ethanol and butanol production. All these features make this strain a promising candidate for biofuel production, especially for consolidated bioprocessing (CB), where cellulosic biomass is directly converted to liquid biofuels. When thermophiles are used at high temperatures, additional advantages such as low risk of contamination and toxicity, high rates of mixing, and direct recovery of distillate make the process even more appealing. Hence, the suitability of B. thermoruber 423 for such next-generation biofuel processes can be investigated by assessing its cellulolytic, ethanologenic, and butanologenic activities and then by determining the ethanol/butanol yields of cultures growing on different biomass resources. Similarly, lipases of B. thermoruber 423 can also be characterized to elucidate their potential use in the detergent or biodiesel industries.
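The proposed xylose route can be written down as an ordered list of annotated steps, which makes the gene-enzyme-reaction associations explicit. The sketch below is illustrative only: the EC numbers and gene identifiers are those given in the text, the product name "D-xylulose 5-phosphate" follows standard biochemistry for EC 2.7.1.17, and the `Step` type and `trace` helper are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str
    ec: str
    gene: str


# EC numbers and gene identifiers as given in the text.
XYLOSE_ROUTE = [
    Step("D-xylose", "D-xylulose", "xylose isomerase", "5.3.1.5", "Bth.peg.3943"),
    Step("D-xylulose", "D-xylulose 5-phosphate", "xylulose kinase", "2.7.1.17", "Bth.peg.3945"),
]


def trace(route, start):
    """Follow the route from `start` and return the metabolites visited, in order."""
    by_substrate = {step.substrate: step for step in route}
    path = [start]
    # Assumes the route is linear and acyclic, as it is here.
    while path[-1] in by_substrate:
        path.append(by_substrate[path[-1]].product)
    return path


path = trace(XYLOSE_ROUTE, "D-xylose")
print(" -> ".join(path) + " -> pentose phosphate pathway")
for step in XYLOSE_ROUTE:
    print(f"{step.enzyme} (EC {step.ec}, {step.gene})")
```

The same structure could be extended with the cellulase, ethanol, and butanol steps once their gene assignments are confirmed experimentally.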

The presence of arsenic resistance genes suggests that B. thermoruber 423 is an arsenite-specific expulsion prokaryote rather than a dissimilatory arsenic-reducing prokaryote (Lloyd 2010). The predicted arsenic resistance genes in the genome make B. thermoruber 423 a candidate organism for bioremediation studies.

The genome annotation results for B. thermoruber 423 include genetic information such as genome position, coding region, gene product function, and Enzyme Commission (EC) numbers. All this information will play an important role in the reconstruction of a mathematical metabolic model of B. thermoruber 423 and similar strains. Such models could make it possible to develop metabolic engineering strategies for B. thermoruber 423 aimed at medium optimization and genetic modification, increased biopolymer production, and altered EPS monomer composition. Consequently, the whole-genome information will be a crucial guide for upcoming in silico and in vivo studies of this organism.
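To illustrate how annotated reactions feed into a mathematical metabolic model, the sketch below builds a stoichiometric matrix for a toy network and checks the steady-state mass balance. It is a simplified assumption-laden example, not the model under construction: it uses only the two xylose steps named in the text, omits cofactors such as ATP/ADP, and adds a hypothetical uptake reaction and a hypothetical drain toward the pentose phosphate pathway. All identifiers (`uptake`, `XI`, `XK`, `drain`, `xu5p`) are illustrative.

```python
# Toy network: each reaction maps metabolites to stoichiometric coefficients
# (negative = consumed, positive = produced). Cofactors are omitted.
reactions = {
    "uptake": {"xylose": 1},                   # hypothetical exchange: -> xylose
    "XI": {"xylose": -1, "xylulose": 1},       # xylose isomerase, EC 5.3.1.5
    "XK": {"xylulose": -1, "xu5p": 1},         # xylulose kinase, EC 2.7.1.17
    "drain": {"xu5p": -1},                     # hypothetical drain to the PPP
}


def build_matrix(reactions):
    """Return (metabolites, reaction names, S) with rows = metabolites, columns = reactions."""
    metabolites = sorted({m for stoich in reactions.values() for m in stoich})
    names = list(reactions)
    S = [[reactions[r].get(m, 0) for r in names] for m in metabolites]
    return metabolites, names, S


def net_rates(S, fluxes):
    """Compute S * v: the net production rate of each metabolite."""
    return [sum(coef * v for coef, v in zip(row, fluxes)) for row in S]


metabolites, names, S = build_matrix(reactions)

# Equal flux through every step: no metabolite accumulates (steady state).
balanced = net_rates(S, [1, 1, 1, 1])

# Halving the kinase flux makes xylulose accumulate and depletes xu5p.
unbalanced = net_rates(S, [1, 1, 0.5, 1])

print(dict(zip(metabolites, balanced)))
print(dict(zip(metabolites, unbalanced)))
```

A genome-scale reconstruction applies the same mass-balance constraint to all annotated reactions and typically adds flux bounds and an objective function on top of it.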

Although the annotated functions call for experimental verification, the results of this study are significant in that they highlight the biotechnological and industrial potential of the thermophilic B. thermoruber 423, improve the understanding of the biological mechanisms and whole genomic organization of this bacterium, and clarify the metabolism of the microorganism. This understanding is crucial for the design and optimization of engineering strategies for both overproduction and improved properties of industrially important products, especially EPSs. Current studies are focused on these topics, and the genome sequence of B. thermoruber strain 423 is being used for the reconstruction of a genome-scale mathematical model.