Introduction

Combustion of sulphur-containing compounds in fossil fuels emits sulphur oxides, which have adverse effects on health, the environment and the economy (Diaz and Garcia 2010). Sulphur in fossil fuels can be removed by conventional methods; however, refractory organic sulphur is very difficult to remove, and the conventional methods are costly and energy intensive. Other methods, such as biodesulphurization (BDS), have shown good potential for removing refractory sulphur under milder conditions (Soleimani et al. 2007).

Dibenzothiophene (DBT) and its derivatives are generally considered model target compounds for biodesulphurization studies (Mohebali and Ball 2008), and a significant number of microorganisms have been found that remove sulphur from DBT via the sulphur-specific (4S) pathway. In this pathway, DBT is desulphurized to 2-hydroxybiphenyl (2-HBP) without breaking carbon–carbon bonds, and thus the calorific value of the fuel is conserved (Akhtar et al. 2009; Diaz and Garcia 2010; Soleimani et al. 2007; Mohebali and Ball 2008). Although the genes involved in microbial DBT desulphurization have been isolated from various bacterial strains, only the genes of R. erythropolis strain IGTS8 (dszA, dszB, dszC) have been extensively studied and characterized (Piddington et al. 1995). DBT-desulphurizing genes organized in operons have also been isolated and cloned from the thermophilic bacteria Paenibacillus sp. A11-2 (tdsABC), Bacillus subtilis WU-S2B (bdsABC) and Mycobacterium phlei WU-F1 (tdsABC) (Konishi et al. 1997; Ishii et al. 2000; Kirimura et al. 2001, 2004; Furuya et al. 2001).

All known bacteria able to desulphurize DBT to 2-HBP through the sulphur-specific pathway possess three desulphurization enzymes. In R. erythropolis IGTS8, the DszC enzyme catalyzes two consecutive monooxygenation reactions that convert DBT to DBT sulphone. DszA, a second monooxygenase, then converts DBT sulphone to hydroxyphenyl benzenesulphinate (HPBS). Finally, DszB, a desulphinase, transforms HPBS to 2-HBP and sulphite (Gray et al. 1996, 2003). All three genes are clustered on a 120 kb linear plasmid of strain IGTS8 (Denome et al. 1994).

We reported earlier that Rhodococcus spp. (Eu-32) is an organic sulphur metabolizing culture able to convert DBT into 2-HBP through an extended sulphur-specific (4S) pathway with biphenyl as a second end product (Akhtar et al. 2009). In this paper, we report the isolation and identification of the dszABC genes of Rhodococcus spp. (Eu-32). We used a range of degenerate and standard (non-degenerate) primers for the PCR amplification of these genes, individually or jointly. The dszA and dszC genes, whose products catalyse the first three steps of DBT desulphurization to HPBS, were identified and phylogenetically analysed. In the last step of desulphurization, R. erythropolis strain IGTS8 converts HPBS to 2-HBP using the DszB enzyme. In Eu-32 this step is followed by a dehydroxylation to biphenyl, suggesting that either a fourth enzyme or a significantly different DszB enzyme is involved in the pathway.

Materials and methods

Bacterial culture and growth conditions

Rhodococcus spp. (Eu-32) used in this study was previously isolated from a soil sample taken from the roots of a eucalyptus tree. Isolate Eu-32 was cultured either in MG or LB medium as described earlier (Akhtar et al. 2009).

Primers used to PCR amplify the dszABC genes

Two types of primers (standard and degenerate) were used in order to amplify the dszABC gene sequences of Rhodococcus spp. (Eu-32) by polymerase chain reaction (PCR). Standard primers were based on the 5′ and 3′ termini of dsz genes of Rhodococcus erythropolis IGTS8. The nucleotide sequences of the degenerate PCR primers were designed using conserved amino acid sequence regions of published DBT-desulphurizing enzymes from R. erythropolis IGTS8 (DszA AAA99482.1, DszB AAA99483.1 and DszC AAA99484.1), Paenibacillus sp. A11-2 (TdsA BAA94831.1, TdsB BAA94832.1 and TdsC BAA94833.1), Bacillus subtilis WU-S2B (BdsA BAC20180.1, BdsB BAC20181 and BdsC BAC20182.1) and Gordonia alkanivorans strain 1B (DszA AAT78716.1, DszB AAT78717.1 and DszC AAT78718.1). Oligonucleotides for PCR amplification of dszABC genes were purchased from Integrated DNA Technologies, Inc., UK. The sequences of the standard and degenerate primers used in this study are shown in Table 1.

Table 1 List of primers used for the PCR amplification of dszABC genes of Rhodococcus spp. (Eu-32)

DNA extraction, PCR and molecular biology techniques

Rhodococcus spp. (Eu-32) was grown in LB medium overnight. Total genomic DNA was extracted using the method described by Ausubel et al. (1995) and stored at −20 °C until required. PCR amplifications were performed with Biotaq polymerase (Bioline, UK). The dNTP mixture (10 mM) was purchased from Fermentas (Thermo Fisher Scientific, USA). Amplification was carried out in a Geneflow thermal cycler (UK) with 1 cycle of 5 min of denaturation at 95 °C; 30 cycles of 1 min of denaturation at 95 °C, 1 min of annealing at 55 °C and 2 min of extension at 72 °C; and a final extension at 72 °C for 15 min. The PCR-amplified target bands were gel extracted using the NucleoSpin Extract II Gel Extraction Kit (Germany) as per the manufacturer’s recommendations. Escherichia coli TOP10 One Shot competent cells (Invitrogen, USA) were used as the host strain for general cloning. Transformation was performed according to the supplier’s recommendations. Published recombinant techniques (Sambrook et al. 1989) were used unless otherwise noted. DNA fragments amplified by PCR were ligated into the cloning vector pGEM-T Easy (Promega). Plasmid DNA was isolated using the PureLink™ HQ Mini Plasmid Purification Kit (Invitrogen, USA). Cloned fragments were subsequently sequenced by Source Bioscience, UK, using the M13 or T7 forward and reverse primers.
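
The cycling programme described above can be written down as a small data structure, which makes the programmed hold time easy to check. The following is only an illustrative sketch: the names `PROGRAMME` and `total_hold_minutes` are hypothetical, and the ramping time between temperatures, which depends on the instrument, is ignored.

```python
# Thermal cycling programme as stated in the text.
# Each stage: (number of cycles, [(step name, temperature in deg C, minutes), ...])
PROGRAMME = [
    (1, [("initial denaturation", 95, 5)]),
    (30, [("denaturation", 95, 1), ("annealing", 55, 1), ("extension", 72, 2)]),
    (1, [("final extension", 72, 15)]),
]


def total_hold_minutes(programme):
    """Sum the programmed hold times, ignoring ramping between steps."""
    return sum(
        cycles * sum(minutes for _name, _temp, minutes in steps)
        for cycles, steps in programme
    )


print(total_hold_minutes(PROGRAMME))  # 5 + 30 * 4 + 15 = 140
```

Under these assumptions the programme amounts to 140 min of hold time per run.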

Blast and phylogenetic analysis of the genes

DNA sequence identity was determined using the basic local alignment search tool (BLAST) program “tblastx” of the National Center for Biotechnology Information (NCBI) (www.ncbi.nlm.nih.gov). Phylogenetic analysis was performed by the maximum likelihood (ML) method using the programme MEGA5 (Tamura et al. 2011). The Tamura–Nei model (Tamura and Nei 1993) was used for the 16S rRNA gene and the Jones–Taylor–Thornton (JTT) matrix-based model (Jones et al. 1992) for the DszA and DszC amino acid sequences. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using either the Maximum Composite Likelihood (MCL) approach or the JTT model, and then selecting the topology with the superior log likelihood value. The reliability of internal branches was assessed by bootstrapping (100 bootstrap replicates). All positions containing missing data and gaps were eliminated. The percentage identity between the deduced amino acid sequences and other sequences was determined using the Clone Manager suite version 7.0.
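
The idea of a percentage identity computed over aligned positions, with gap-containing positions excluded, can be sketched as follows. This is not the algorithm used by Clone Manager, whose details are not given here; the function name `percent_identity` and the toy sequences are hypothetical and serve only to illustrate the calculation.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = [(a, b) for a, b in zip(aligned_a, aligned_b)
                if a != gap and b != gap]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)


# Toy aligned fragments (hypothetical, not taken from the study).
print(round(percent_identity("ARTLERG-FD", "ARSLERGAFE"), 1))
```

Different tools use different denominators (for example, alignment length including gaps, or the length of the shorter sequence), so values from this sketch are not directly comparable with those in Table 4.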

Nucleotide sequence accession numbers

Nucleotide sequences from partial dszA and dszC genes were deposited with GenBank of NCBI (www.ncbi.nlm.nih.gov) under accession numbers KF031127 and KF031128 respectively. The nucleotide and protein sequences used in this study for phylogenetic analysis were obtained from the NCBI and their accession numbers are given in Tables 2 and 3. The 16S rRNA gene sequence for strain Eu-32 was previously deposited under accession number DQ386111.

Table 2 Bacterial species names and NCBI accession numbers of 16S rRNA gene sequences used
Table 3 Bacterial species names and NCBI accession numbers of DszA and DszC sequences used

Results

For PCR amplification and characterization of the presumed dszABC genes involved in C–S bond cleavage in Rhodococcus spp. (Eu-32), we designed specific primers based on the 5′ and 3′ regions of the dszABC genes of R. erythropolis IGTS8. Despite stringent PCR conditions, these primers generated multiple bands and did not amplify homologous dsz genes from Rhodococcus spp. (Eu-32) (data not shown). Consequently, degenerate PCR primers were designed to amplify the putative dszABC genes of Eu-32. For this purpose, the protein sequences of the desulphurization enzymes of four different bacteria, Rhodococcus erythropolis IGTS8 (AAA99482), Gordonia alkanivorans strain 1B (AAT78716), Bacillus subtilis WU-S2B (BAC20180) and Paenibacillus sp. A11-2 (BAA94831), were aligned, which revealed multiple conserved regions (Supplementary Figs. 1, 2, 3). Degenerate primers were designed from these conserved amino acid regions and the inferred nucleotide sequences.
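
Locating conserved regions in a protein alignment can be illustrated with a minimal sketch that reports runs of columns identical in every sequence. The function name `conserved_runs`, the minimum run length and the toy alignment are hypothetical; the toy sequences merely embed the ARTLERG motif in invented flanking residues and are not the sequences of Supplementary Figs. 1, 2, 3.

```python
def conserved_runs(aligned, min_len=6):
    """Return (start, segment) for runs of gap-free columns identical in all sequences.

    Start positions are 0-based alignment columns.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    runs, start = [], None
    for i in range(length + 1):  # one extra step closes a run at the end
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                runs.append((start, aligned[0][start:i]))
            start = None
    return runs


# Toy alignment (hypothetical flanking residues around a known motif).
toy = ["MKARTLERGAFD",
       "MRARTLERGSFD",
       "MQARTLERG-FD"]
print(conserved_runs(toy))
```

A length threshold of six or seven residues is used here only because a primer of about 20 nucleotides spans roughly that many codons.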

Isolation of dszA and dszC genes

For isolation of the dszA gene of Rhodococcus spp. (Eu-32), PCR reactions were performed with primers based on BdsA of B. subtilis WU-S2B, TdsA of Paenibacillus sp. A11-2 and DszA of R. erythropolis IGTS8 and G. alkanivorans 1B. The DszA enzymes of R. erythropolis IGTS8 and G. alkanivorans 1B consist of 453 and 474 amino acids, respectively. BdsA of B. subtilis WU-S2B and TdsA of Paenibacillus sp. A11-2 consist of 453 and 454 amino acids, respectively. The DszA/BdsA/TdsA alignment contains 15 conserved regions (Supplementary Fig. 1). Taking into account the G + C content and the redundant positions in the nucleotide sequences corresponding to the conserved amino acid regions, different degenerate primers were designed to isolate the dszA gene of Eu-32. Kilbane and Robbins (2007) reported four conserved motifs that can be diagnostic for desulphurization enzymes and allow the construction of the most useful PCR primers. These amino acid sequences were HLAGFF, ARTLERG, FDLLFLPDGLA, and RYDRADEFL, which correspond to positions 8–13, 44–50, 52–62, and 159–167, respectively, in the DszA protein of R. erythropolis IGTS8. In our study, the amino acid regions ARTLERG (5′GCSCGKACHCTSGAGCGSGGC3′), RYDRADEFL (5′AAYTCRTCRGCVCGGTCRTABCG3′) and PVILQAG (5′CCSGCCTGMADGATSACMGG3′) were fully conserved and facilitated primer design for the PCR amplification of the target regions of the dszA gene of Eu-32. The sequences of the amplicons obtained using these degenerate primers allowed us to construct non-degenerate primers specific for the dszA gene of Rhodococcus spp. (Eu-32) (see Table 1). Using these primers we determined a 496 bp sequence of the dszA gene of isolate Eu-32. tblastx analysis of this 496 bp sequence showed 92 % identity with a putative FMNH2-dependent monooxygenase encoded in the complete genome sequence of R. erythropolis PR4.
Alignment of the deduced amino acid sequence of DszA of isolate Eu-32 with DszA enzymes from other desulphurizing bacteria showed that DszA of Eu-32 has low homology in the regions conserved among desulphurizing enzymes (Fig. 1a).
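
The correspondence between the conserved motifs and the degenerate primer sequences quoted above can be checked with a short script. This is a sketch under stated assumptions: the helper names (`reverse_complement`, `encoded_residues`, `can_encode`) are hypothetical, the standard genetic code is assumed, and the RYDRADEFL and PVILQAG primers are treated as reverse primers, which is inferred here from their sequences rather than stated in the text.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of a sequence written with IUPAC codes."""
    return seq.translate(COMPLEMENT)[::-1]


def encoded_residues(codon):
    """Set of amino acids encoded by all expansions of a degenerate codon."""
    return {CODON_TABLE["".join(bases)]
            for bases in product(*(IUPAC[code] for code in codon))}


def can_encode(sense_seq, motif):
    """True if each complete codon of sense_seq can encode the matching motif residue."""
    codons = [sense_seq[i:i + 3] for i in range(0, len(sense_seq) - 2, 3)]
    return all(aa in encoded_residues(codon) for codon, aa in zip(codons, motif))


print(can_encode("GCSCGKACHCTSGAGCGSGGC", "ARTLERG"))
print(can_encode(reverse_complement("AAYTCRTCRGCVCGGTCRTABCG"), "RYDRADEFL"))
print(can_encode(reverse_complement("CCSGCCTGMADGATSACMGG"), "PVILQAG"))
```

The check is deliberately lenient: it asks whether the target residue is among those a degenerate codon can encode, not whether the codon encodes only that residue, since a degenerate position may also cover other amino acids.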

Fig. 1

a Multiple alignment of the partial deduced amino acid sequence of DBTO2 monooxygenase from Eu-32 with its equivalents from other strains, showing that the boxed regions are highly conserved among the desulphurizing bacteria but not in isolate Eu-32. b Multiple alignment of the partial deduced amino acid sequence of DBTO monooxygenase from Eu-32 with its equivalents from other strains, showing that the boxed regions are highly conserved among different desulphurizing bacterial species as well as in isolate Eu-32. The valine residue is indicated by a bold arrow. Alignment was carried out using ClustalX 1.81 software under the default settings (multiple alignment parameters: gap opening 10.00 and gap extension 0.20)

For isolation of the dszC gene of Rhodococcus spp. (Eu-32), two conserved regions of the DszC protein sequences (Supplementary Fig. 2) were used to design PCR primers. A 23 bp forward primer with four redundant positions was constructed targeting the region GNASSENN (5′GGRAAYGCMTCCAGCGARAACAA3′), and a 24 bp reverse primer with five redundant positions was constructed targeting the GFDRFWR (5′GTYGCGCCARAASCGGTCRAADCC3′) region of DszC. PCR amplification of the dszC gene using the degenerate primer pair GNASSENN-CF and GFDRFWR-CR resulted in a major amplicon of approximately the expected size (771 bp), which was cloned and sequenced. tblastx analysis of the sequence showed identity with desulphurizing enzymes of several genera including Rhodococcus, Nocardia, Gordonia, Bacillus, Mycobacterium, Agrobacterium and Paenibacillus. Alignment of the deduced amino acid sequence of DszC of isolate Eu-32 with equivalent proteins from other desulphurizing bacteria showed that DszC of Eu-32 has high homology in the regions conserved among desulphurizing enzymes (Fig. 1b).
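
The number of redundant positions in the two dszC primers, and the number of distinct oligonucleotides each degenerate primer represents, can be tallied directly from the IUPAC codes. The sketch below uses hypothetical helper names (`redundant_positions`, `fold_degeneracy`); the primer sequences are those quoted above.

```python
from math import prod

# Number of bases represented by each IUPAC nucleotide code.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}


def redundant_positions(primer):
    """Count positions written with an ambiguity code."""
    return sum(1 for base in primer if IUPAC_SIZE[base] > 1)


def fold_degeneracy(primer):
    """Number of distinct sequences in the degenerate primer pool."""
    return prod(IUPAC_SIZE[base] for base in primer)


for name, seq in [("GNASSENN-CF", "GGRAAYGCMTCCAGCGARAACAA"),
                  ("GFDRFWR-CR", "GTYGCGCCARAASCGGTCRAADCC")]:
    print(name, len(seq), redundant_positions(seq), fold_degeneracy(seq))
```

The counts agree with the primer lengths and numbers of redundant positions given in the text; the fold degeneracy is an additional derived quantity and is not reported in the study.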

Attempted isolation of dszB gene

The DszB/BdsB/TdsB proteins of R. erythropolis IGTS8, G. alkanivorans 1B, B. subtilis WU-S2B and Paenibacillus sp. A11-2 consist of about 353–365 amino acids and contain many conserved regions (Supplementary Fig. 3). Four conserved regions were judged to be appropriate for the construction of degenerate primers. These regions are AYTRFGG, ASVWTVS, VQRLVD, and WAAPEFL, which correspond to amino acid positions 67–73, 252–258, 269–274, and 351–357, respectively, of DszB of R. erythropolis IGTS8. The PCR reactions set up using degenerate primers for amplification of the dszB gene of Eu-32 yielded several amplicons; those of the expected size were cloned and sequenced. The DNA sequences showed identity with genome sequences of several Rhodococcus species, but not with dszB genes.

Phylogenetic analysis

Phylogenetic analysis is a powerful tool for sorting and interpreting molecular data. Phylogenetic comparisons of Eu-32 with other relevant organisms from the same genera were carried out using both the desulphurizing enzymes DszA and DszC and the 16S rRNA gene sequence. For the 16S rRNA gene comparison we chose 13 sequences, including at least two sequences of 1,200–1,400 bp from each genus and/or species included in the DszA and DszC trees. The phylogenetic analysis based on the 16S rRNA gene sequences showed that the species, as expected, form clades within their respective genera (Fig. 2). The phylogenetic analysis of the DszC sequence of isolate Eu-32 and other desulphurizing bacterial monooxygenases (Fig. 3) showed that Eu-32 also forms a close clade with Agrobacterium tumefaciens, Gordonia alkanivorans, Rhodococcus sp. SDUZAWQ, Rhodococcus sp. XP and R. erythropolis IGTS8. On the other hand, phylogenetic analysis based on the DszA sequences of the FMNH2-dependent monooxygenases (Fig. 4) shows that Eu-32 is only distantly related to other Rhodococcus species; indeed, it forms an independent clade along with Paenibacillus sp. A11-2.

Fig. 2

Molecular phylogenetic analysis based on 16S rRNA gene sequences, showing the relationship of isolate Eu-32 with other bacterial species by the Maximum Likelihood method. The tree with the highest log likelihood (−4,420.5508) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 0.5580)). The rate variation model allowed for some sites to be evolutionarily invariable ([+ I], 31.1375 % sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site

Fig. 3

Molecular phylogenetic analysis based on the partial FMNH2 dependent DBTO monooxygenase (DszC) sequence of Eu-32 (in this study) and other desulphurizing bacterial species by the Maximum Likelihood method. The tree with the highest log likelihood (−1,773.7158) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 3.2778)). The rate variation model allowed for some sites to be evolutionarily invariable ([+ I], 20.8929 % sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site

Fig. 4

Molecular phylogenetic analysis based on the partial FMNH2 dependent DBTO2 monooxygenase (DszA) sequence of Eu-32 (in this study) and other desulphurizing bacterial species by Maximum Likelihood method. The tree with the highest log likelihood (−1,261.6846) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 3.0844)). The rate variation model allowed for some sites to be evolutionarily invariable ([+ I], 12.3468 % sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site

Discussion

Generally, if two organisms are related, their DNA sequences for a particular gene will show some degree of identity. The desulphurization genes have often been shown to be conserved among Rhodococcus species (Duarte et al. 2001). We reported earlier that Rhodococcus spp. (Eu-32) desulphurizes DBT through a novel extended 4S pathway (Akhtar et al. 2009). As the isolate belongs to the genus Rhodococcus, we presumed that it might also harbour DBT desulphurizing genes (dszABC) similar to those of the well documented desulphurizing bacterium Rhodococcus erythropolis IGTS8. However, attempts to use PCR primers based on the 5′ and 3′ termini of the R. erythropolis IGTS8 desulphurization genes were completely unsuccessful in obtaining appropriate portions of the desulphurization genes from Rhodococcus spp. (Eu-32). The redundant nature of the genetic code makes it possible for DNA sequences to vary from one species of microorganism to another yet encode proteins with identical amino acid sequences. Therefore, examining protein sequences may allow the detection of conserved regions of genes that might not be as readily detected by comparing DNA sequence data (Kilbane and Robbins 2007). Conserved and unique regions of desulphurization proteins were identified, and degenerate PCR primers targeting these sites were designed.

In a BLAST search, the sequence most closely related to the dszA gene fragment (496 bp) of isolate Eu-32 was from Rhodococcus erythropolis PR4, with 92 % sequence identity to a putative FMNH2-dependent monooxygenase. Analysis of the deduced 164 amino acid sequence from the 496 bp dszA fragment of isolate Eu-32 using the BLAST tools on the NCBI web site showed 77 % identity with putative FMNH2-dependent monooxygenases of several bacterial species. Low identities (range 32–37 %) were found between the partial DszA of isolate Eu-32 and other corresponding DBT desulphurizing monooxygenases. The partial DszA of isolate Eu-32 showed only 36 % sequence identity with DszA of Rhodococcus erythropolis IGTS8 (Table 4), despite the organisms being closely related taxonomically (see Fig. 2). This degree of identity was similar to that shown for other FMNH2-dependent monooxygenases from taxonomically distant organisms, such as SnaA, the pristinamycin IIA synthase subunit A of Streptomyces pristinaespiralis (42 %; Blanc et al. 1995), and NtaA, the nitrilotriacetate monooxygenase of Chelatobacter heintzii (39 %; Knobel et al. 1996). A comparison of the deduced amino acid sequence of Eu-32 with the reference sequence of strain IGTS8 indicated that, over the 164-amino-acid region, there were 21 amino acid gaps at different positions (see Fig. 1a). Some highly conserved protein motifs present in all other known DszA enzymes are significantly altered in Eu-32, yet a functional DszA protein is present. These changes did not result in a substantial modification of the function of the gene product, as established from the analysis of the metabolites of the extended 4S pathway of Eu-32 (Akhtar et al. 2009).

Table 4 The identity of deduced DszA and DszC partial sequences of Rhodococcus spp. (Eu-32) with related partial sequences of desulphurizing bacteria

The deduced 257 amino acid sequence from the 771 bp dszC gene fragment of isolate Eu-32 showed high identities (range 53–99 %) with dibenzothiophene monooxygenases from different bacterial species, as determined through alignments (Table 4). The valine at position 261 of DszC of R. erythropolis IGTS8 is important for C–S bond cleavage specificity in refractory organic sulphur-containing compounds (Kirimura et al. 2004; Kilbane and Robbins 2007; Arensdorf et al. 2002). This Val residue is also conserved in DszC of Rhodococcus spp. (Eu-32) (see Fig. 1b).
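
The relationship between the fragment lengths and the lengths of the deduced amino acid sequences is simple codon arithmetic. The sketch below (the function name `complete_codons` is hypothetical) lists how many complete codons fit in each of the three forward reading frames of a fragment. The reading-frame offsets of the sequenced fragments are not stated in the text, so this only shows which frames are arithmetically compatible with the reported 257 and 164 residues.

```python
def complete_codons(length_bp):
    """Number of complete codons in each of the three forward reading frames."""
    return [(length_bp - offset) // 3 for offset in range(3)]


print(complete_codons(771))  # dszC fragment
print(complete_codons(496))  # dszA fragment
```

For the 771 bp fragment, 257 codons fit exactly when translation starts at the first base; for the 496 bp fragment, 164 complete codons correspond to a frame starting at the third base, or to a shorter translated region in one of the other frames.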

In further attempts to amplify the complete dszABC operon of isolate Eu-32 by PCR, a primer targeting the GFDRFWR region of the DszC protein was used in combination with the primers specific for dszA of isolate Eu-32. However, these primers did not amplify the target region of the dszABC operon. Thus it is possible that the presumed three genes are not present as an operon in Eu-32, and that each gene has an independent promoter at its 5′ end. Moreover, the DNA sequence of the presumed dszB gene of isolate Eu-32 may be significantly different from the previously described sequences upon which we based our primer designs (Piddington et al. 1995; Ishii et al. 2000; Kirimura et al. 2004; Denome et al. 1994; Alves et al. 2007). Indeed, Lee and co-workers (2006) observed that proteins having homology with DszB are often found in genomic contexts that differ from the dsz operon in many microbial species, hinting at the presence of as yet undiscovered sulphur metabolic pathways. Likewise, as the dszA gene of Eu-32 was most similar to a sequence in the genome of R. erythropolis PR4, in a genomic context that differs from the dsz operon and with no known sulphur metabolic pathway, it is possible that other genes (such as dszB) are also homologous to genome sequences of microbial species whose desulphurizing abilities have yet to be discovered. Additionally, Eu-32 shows a novel final step in the desulphurization pathway, in which biphenyl is liberated from 2-HBP, suggesting that its DszB enzyme is significantly different from those previously described (Fig. 5).

Fig. 5

Extended 4S pathway of DBT desulphurization (showing the genes involved in different steps) by isolate Eu-32 (modified from Akhtar et al. 2009). Abbreviations are as follows: DBT dibenzothiophene, DBTO DBT sulphoxide, DBTO2 DBT sulphone, HPBS hydroxyphenyl benzenesulphinate, 2-HBP 2-hydroxybiphenyl

Phylogenetic analysis and evidence for horizontal gene transfer (HGT)

Evidence for horizontal gene transfer events can be detected using common phylogenetic methods (Pylro et al. 2012). Four major methods, namely distance, parsimony, maximum likelihood and Bayesian, are widely used for phylogenetic tree construction (Hall 2013). In our study we used the maximum likelihood (ML) method as implemented in the programme MEGA5 (Tamura et al. 2011). We chose the ML method because it can perform better than distance or parsimony methods owing to its statistical properties and inferential power on sequences of different lengths or divergence (Anisimova et al. 2013). The long branch length shown by Eu-32 in Fig. 4 (1.3829) is attributed to the fact that the DszA sequence of isolate Eu-32 is highly divergent compared with the sequences of other Rhodococcus species. This is consistent with the dszA gene of Eu-32 having been acquired through horizontal gene transfer (HGT), as discussed by Denis-Larose et al. (1997) and Kirimura et al. (2004). The plasmid location of the dsz genes (Piddington et al. 1995; Denome et al. 1994; Denis-Larose et al. 1997; Oldfield et al. 1997) would facilitate successful transfer of these genes through HGT.

In conclusion, the well characterized dszABC operon of Rhodococcus erythropolis strain IGTS8 was not found in our Rhodococcus strain Eu-32, with its novel desulphurization pathway. The divergence of the Eu-32 dszA gene from the well documented dszA gene of strain IGTS8, together with the close relatedness of the Eu-32 dszC gene to the dszC gene of strain IGTS8, probably reflects HGT events in the acquisition of the desulphurization enzymes of Eu-32. A homologue of the dszB gene of Rhodococcus IGTS8 was not found in Eu-32. This may be related to the novel 4S desulphurization pathway employed by Eu-32 (Fig. 5), in which HPBS is desulphurized to 2-HBP in a reaction catalysed by DszB and then, uniquely in Eu-32, further dehydroxylated to biphenyl. This suggests that the DszB enzyme of Eu-32 is significantly different from other DszB enzymes previously described in biodesulphurization pathways, and hints at considerable diversity yet to be found in such pathways. Future studies will concentrate on fully characterising the activities of the DszA and DszC enzymes from Eu-32 and on identifying the enzymic activities responsible for the formation of 2-HBP and biphenyl from HPBS, i.e. the still missing DszB functionality.