Introduction

Erythritol (1,2,3,4-butanetetrol) is a four-carbon sugar alcohol that is widely distributed in nature. It has been detected in fruits, fermented foods, and drinks, as well as in the body fluids of mammals [32], and it can be decomposed by bacteria, mushrooms, and yeast [33]. Erythritol has about 70 % of the sweetness of sucrose in a 10 % (w/v) solution. The safety of erythritol has been demonstrated consistently by both animal toxicological and clinical studies, even when it is consumed daily in high amounts [32]. More than 90 % of ingested erythritol is not metabolized by the human body and is excreted unchanged in the urine without changing blood glucose and insulin levels [39]. In addition, erythritol has been reported to act as an antioxidant in vivo and may help protect against hyperglycemia-induced vascular damage [7]. Therefore, erythritol can be used as a functional sugar substitute in special foods for people with diabetes and obesity [39, 41].

Erythritol production from natural sources such as fruits and vegetables is not practical because of their relatively low erythritol contents. Erythritol can be synthesized chemically from dialdehyde starch at high temperature in the presence of a nickel catalyst. However, this process has not been industrialized because of its low efficiency and high cost [32, 39]. In 1950, Binkley and Wolfrom [3] were the first to suggest that erythritol could be produced by yeast, because traces of erythritol were detected in the residue of fermented Cuban blackstrap molasses. Subsequently, erythritol production by yeasts and yeast-like fungi was reported [10, 49, 50]. Since then much attention has focused on screening and breeding erythritol-producing microorganisms and on optimizing the fermentation process [6, 11, 13–17, 21, 22, 27, 33, 46, 48]. Erythritol is now produced commercially by fermentation with osmophilic yeasts. The major erythritol-producing microorganisms reported are strains from Pseudozyma tsukubaensis [14], Trigonopsis variabilis [16], Moniliella [29–31], Ustilaginomycetes [11], Trichosporon [37, 38], Yarrowia lipolytica [42, 43], Penicillium [27], Moniliella tomentosa var. pollinis [4, 5], Torula [15, 21, 22, 34], Pichia [6], Candida [48], Candida magnoliae [17, 44, 46, 52, 53], Trichosporonoides megachiliensis [35], and Aureobasidium [12, 13]. Although the productivity of erythritol can be increased by different production methods such as batch and fed-batch operation [4, 5, 37, 43], newly isolated strains can still be competitive in industrial production. Therefore, screening and development of robust and novel microbial strains still play a key role in erythritol production [32].

Erythritol in yeast was found to be synthesized from erythrose-4-phosphate, an intermediate of the pentose phosphate cycle, by dephosphorylation followed by reduction of the resultant erythrose. Erythrose reductase (ER), which catalyzes the last step, is a key enzyme in the biosynthesis of erythritol [32, 47]. Although much research has been done on the purification and characterization of ERs [12, 23–26, 36], the corresponding ER genes remain unidentified for most erythritol-producing yeasts. To the best of our knowledge, ER genes have been identified to date only in Trichosporonoides megachiliensis SNG-42 [35] and Candida magnoliae JH110 [20].

Recently, we isolated a high erythritol-producing microorganism, named strain BH010, from honey. In our previous study, the strain was cultivated in flasks with shaking at 150 rpm for 9 days at 30 °C in a medium consisting of 30 % glucose, 1 % yeast extract, and 0.2 % calcium chloride with a 1 % inoculum, and an erythritol concentration of 110.61 g/l was determined using high-performance liquid chromatography with an NH2 column (4.6 × 250 mm; Macherey–Nagel Inc., Düren, Germany) and a refractive index detector (RID-10A; Shimadzu, Tokyo, Japan). The application of this strain for industrial-scale erythritol production is promising after appropriate breeding and further treatment.

In this study, we report the morphological and physiological characteristics of this newly isolated strain and discuss its taxonomic position on the basis of ribosomal DNA sequence analysis, which contributes to the taxonomy of erythritol-producing yeasts. The gene sequences of the ERs from strain BH010 are provided, and this is the first report of the existence of introns in ER genes. This study is helpful for understanding the mechanism of erythritol generation and may offer a potential opportunity to produce erythritol in microorganisms that do not otherwise do so.

Materials and methods

Microorganism and cultivation

The microorganism examined in this study was isolated from a honey sample. All the medium components used in this study were of reagent grade and were purchased from Jiangtian Chemical Technology Co., Ltd. (Tianjin, China).

The microorganism was activated by transferring single colonies of the strain from plates to 10 ml activation medium consisting of 20 % glucose and 1 % yeast extract in 50-ml flasks. The flasks were shaken at 30 °C, 150 rpm for 48 h.

Ribosomal DNA sequence analysis

The genomic DNA of strain BH010 was used as a template. The D1/D2 domain of the 26S ribosomal DNA sequence was amplified and sequenced using universal primers NL1 (5′-GCATA TCAAT AAGCG GAGGA AAAG-3′) and NL4 (5′-GGTCC GTGTT TCAAG ACGG-3′) [9, 28]. A partial 18S ribosomal DNA sequence was amplified and sequenced using universal primers EF3 (5′-TCCTC TAAAT GACCA AGTTT G-3′) and EF4 (5′-GGAAG GGRTG TATTT ATTAG-3′) [1]. The internal transcribed spacer and 5.8S rDNA (ITS/5.8S rDNA) sequence was amplified and sequenced using universal primers ITS1 (5′-TCCGT AGGTG AACCT GCGG-3′) and ITS4 (5′-TCCTC CGCTT ATTGA TATGC-3′) [28]. The amplification was conducted by polymerase chain reaction in a PCR thermal cycler (MyCycler, Bio-Rad Laboratories Inc., USA). The amplified sequences were purified and sequenced in both directions by Sangon Biotech Co., Ltd. (Shanghai, China).

Scanning electron microscopy

Samples were fixed for 2 h on glass microscope slides with 2.5 % glutaraldehyde/0.1 M phosphate buffer (pH 7.2). Fixed cells were rinsed with 0.1 M phosphate buffer (pH 7.2), dehydrated in graded ethanol, freeze-dried, sputtered with gold–palladium, and observed under a scanning electron microscope (XL-30 TMP, Philips, the Netherlands).

Physiological characterization

Physiological characterization of strain BH010 was conducted following the conventional methods of Barnett et al. [2].

Amplification of DNA and cDNA for ERs

Four pairs of gene-specific primers were designed according to previously reported ER genes to amplify the genes covering the open reading frames of ERs with 5′ and 3′ flanking regions. The primers ERT1 (5′-ATGTC CTACA ACAAG AACAT CCC-3′) and ERT2 (5′-GTATA AGAGC ACATT AAGCG TTAAT-3′) were designed on the basis of er1 encoding ER1 from T. megachiliensis SNG-42 (GenBank accession number AB191474). The primers ERT3 (5′-ATGTC CTACA ACAAG AACAT CC-3′) and ERT4 (5′-GTATA AGAGC ACATT AAGCG TTA-3′) were designed on the basis of er2 encoding ER2 from T. megachiliensis SNG-42 (GenBank accession number AB191475). The primers ERT5 (5′-ATGTC TTACA AACAG TACAT CCC-3′) and ERT6 (5′-ACTGA ACTCA AAGGT TGGTG TTA-3′) were designed on the basis of er3 encoding ER3 from T. megachiliensis SNG-42 (GenBank accession number AB191476). The primers ERC1 (5′-ATGTC TTCGA CCTAC ACCCT TA-3′) and ERC2 (5′-CTTCA CCGTC TTGCT AGCGC-3′) were designed on the basis of the gene of erythrose reductase from C. magnoliae JH110 (CmER) (GenBank accession number FJ550210). The reverse transcription reaction was performed with 2 μg mRNA, 0.5 μg oligo(dT)18 primer (Sangon Biotech Co., Ltd., Shanghai, China) and M-MuLV reverse transcriptase (Sangon Biotech Co., Ltd., Shanghai, China) at 37 °C for 60 min. Using the DNA or the cDNA of strain BH010 as a template, PCR was performed separately for each primer pair in 35 thermal cycles of 95 °C for 1 min, 56 °C for 1 min, and 72 °C for 1.5 min with Taq DNA polymerase (Fermentas International Inc., Shenzhen, China).

Cloning of DNA and cDNA for ERs

The PCR product was gel-purified and ligated to pUCm-T vector (Sangon Biotech Co., Ltd., Shanghai, China) between the BamHI recognition site and NcoI recognition site to facilitate DNA sequencing. After transformation into competent cells of Escherichia coli DH5α, the recombinants were selected on LB agar plates (10 g/l peptone, 5 g/l yeast extract, and 10 g/l NaCl, pH 7.4) supplemented with ampicillin. Further identification was conducted by colony PCR using the corresponding primers. Recombinant plasmids were extracted and digested with the restriction enzymes BamHI (Fermentas International Inc., Shenzhen, China) and NcoI (Fermentas International Inc., Shenzhen, China). The positive clones were sequenced in both directions with the sequencing primers in pUCm-T vector. Each putative ER gene was cloned into pUCm-T vector and sequenced in at least six replicates.

Sequence analysis

The nucleotide sequence similarity was analyzed with the BLAST algorithm provided by the National Center for Biotechnology Information (NCBI). The deduced amino acid sequences were analyzed with Primer Premier 5 software. The ClustalW2 program provided by the European Molecular Biology Laboratory–European Bioinformatics Institute (EMBL-EBI) was used for multiple sequence alignment of the nucleotide sequences and of the deduced amino acid sequences of the ERs. A phylogenetic tree was constructed with MEGA 4.0 software [51] by neighbor-joining analysis [45]. Bootstrap analysis was conducted with 2,000 replicates. All the aldo–keto reductase sequences analyzed were obtained from the GenBank and SWISS-PROT databases.
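The bootstrap step can be illustrated with a minimal Python sketch. It is not the MEGA 4.0 procedure: it only shows the column-resampling idea behind a bootstrap replicate, using a simple p-distance in place of neighbor-joining tree building. The toy alignment and the function names are hypothetical and are not the sequences analyzed in this study.

```python
import random

def p_distance(seq_a, seq_b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)

def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}

# Hypothetical toy alignment (assumption for illustration only).
toy_alignment = {
    "seqA": "MSYNKNIPLS",
    "seqB": "MSYKQYIPLS",
    "seqC": "MSSTYTLTRA",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(2000)]
distances = [p_distance(r["seqA"], r["seqB"]) for r in replicates]
mean_distance = sum(distances) / len(distances)
print(f"mean bootstrap p-distance seqA vs seqB: {mean_distance:.3f}")
```

In a full analysis, a tree would be rebuilt from each replicate and the fraction of replicates recovering a given clade would be reported as its bootstrap support.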

Results and discussion

Ribosomal DNA sequence analysis

The D1/D2 domain of the 26S rDNA sequence and the ITS/5.8S rDNA sequence were analyzed with the NCBI BLAST algorithm, and the top eight homologous strains are listed in Tables 1 and 2, respectively.

Table 1 BLAST analysis of the D1/D2 domain of the 26S rDNA sequence from strain BH010
Table 2 BLAST analysis of the ITS/5.8S rDNA sequence from strain BH010

Kurtzman and Robnett [18] analyzed the extent of divergence in the D1/D2 domain of the 26S rDNA from approximately 500 species of ascomycetous yeasts and indicated that the intraspecific sequence variation was less than 1 %, whereas the interspecific sequence variation was over 1 %. Although an intraspecific sequence variation of over 1 % was observed in Clavispora lusitaniae [19], the sequence variation of the D1/D2 domain of the 26S rDNA within one species generally ranged from 0 to 1 % for most yeast species [28]. The D1/D2 domain of the 26S rDNA sequence of strain BH010 showed a high level of identity (over 99 %) with several strains from T. madida, M. pollinis, and T. megachiliensis (Table 1). BLAST analysis of the ITS/5.8S rDNA sequence showed a rather low level of identity with other species (Table 2). The BLAST search with the 18S rDNA sequence hit 102 different strains from various genera with identity around 88 % (data not shown), which indicated that the 18S rDNA sequence of strain BH010 was more conserved than the D1/D2 domain of the 26S rDNA and the ITS/5.8S rDNA. Similar results were observed by Guého et al. [8].
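The approximately 1 % D1/D2 guideline can be expressed as a small Python sketch. The sequences below are hypothetical 600-nt placeholders rather than the BH010 sequence, the helper names are illustrative, and the threshold is a heuristic with known exceptions, not a strict species definition.

```python
def percent_divergence(seq_a, seq_b):
    """Percentage of differing positions between two pre-aligned sequences.

    Gap characters, if present, are counted as differences in this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)

def likely_conspecific(seq_a, seq_b, threshold=1.0):
    """Apply the ~1 % D1/D2 guideline (heuristic only)."""
    return percent_divergence(seq_a, seq_b) < threshold

def substitute(seq, positions):
    """Return a copy of seq with a different base at each listed position."""
    swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
    chars = list(seq)
    for i in positions:
        chars[i] = swap[chars[i]]
    return "".join(chars)

# Hypothetical aligned region of 600 nt (assumption for illustration only).
reference = "ACGT" * 150
close_strain = substitute(reference, [10, 200, 450])       # 3 differences
distant_strain = substitute(reference, range(0, 600, 50))  # 12 differences

for name, seq in [("close", close_strain), ("distant", distant_strain)]:
    print(name, percent_divergence(reference, seq),
          likely_conspecific(reference, seq))
```

An identity of over 99 % with strains of several named species, as observed for strain BH010, falls below the threshold for more than one species at once, which is why the D1/D2 data alone support placement in a clade rather than in a single species.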

Rosa et al. [40] analyzed nucleotide sequences from the D1/D2 domains of the large-subunit rDNA together with phenotypic characteristics, and indicated that the genera Moniliella and Trichosporonoides are members of a single, monophyletic clade that would be best represented by a single anamorphic genus. On the basis of taxonomic priority, Rosa et al. [40] proposed the transfer of the five species of the genus Trichosporonoides, including T. madida and T. megachiliensis, to the genus Moniliella. On the basis of BLAST analysis of the D1/D2 domain of the 26S rDNA sequence and the proposal of Rosa et al. [40], we concluded that strain BH010 belongs to the Moniliella clade. Therefore, the strain was named Moniliella sp. BH010 and the D1/D2 domain of the 26S rDNA sequence was submitted to GenBank with accession number JQ798180. Although BLAST analysis of the 18S rDNA sequence and the ITS/5.8S rDNA sequence did not give conclusive results, these sequences still carry useful genetic information about strain BH010 for molecular identification. The 18S rDNA and ITS/5.8S rDNA sequences were submitted to GenBank with accession numbers JQ798179 and JQ798181, respectively.

Morphological characteristics

When cultivated on plates containing 20 % glucose, 1 % yeast extract, and 2 % agar at 30 °C for 3 days, the colonies were circular with an average diameter of 3 mm. The colonies were opaque, milky white, and slightly raised over the agar surface in a convex manner. The colonies had smooth edges, matte and moist surfaces, and a rich aroma of fermentation (data not shown). Scanning electron microscopy was conducted for further observation, and two typical micrographs are shown in Fig. 1. After cultivation in liquid medium consisting of 20 % glucose and 1 % yeast extract at 30 °C, 150 rpm for 48 h, the cells were cylindrical to elliptical with an average size of 5 × 10 μm, occurring singly or in short chains (Fig. 1a). After cultivation on plates consisting of 20 % glucose, 1 % yeast extract, and 2 % agar at 30 °C for 48 h, pseudohyphae were present, occasionally with blastoconidia (Fig. 1b).

Fig. 1

Scanning electron micrographs of strain BH010 cultivated a in liquid medium consisting of 20 % glucose and 1 % yeast extract at 30 °C, 150 rpm for 48 h; b on plates consisting of 20 % glucose, 1 % yeast extract, and 2 % agar at 30 °C for 48 h (×1,000, scale bar 20 μm)

Physiological characteristics

Physiological characteristics of strain BH010, including sugar fermentation, carbon source assimilation, nitrogen source assimilation, and some other physiological tests, are listed in Table 3.

Table 3 Physiological characteristics of strain BH010

Cloning of DNA and cDNA for ERs

Agarose gel electrophoresis results of PCR amplification using the DNA and cDNA of strain BH010 as a template are shown in Figs. 2 and 3, respectively. According to Figs. 2 and 3, PCR amplification of the putative CmER gene was negative, whereas amplification of putative er1, er2, and er3 each afforded a single band, which suggests the presence of er1, er2, and er3 in strain BH010. The proteins from T. megachiliensis SNG-42 encoded by genes er1, er2, and er3 were ER1, ER2, and ER3, respectively [35]. Therefore, in order to avoid ambiguity and to facilitate further analysis, the putative ER1, ER2, and ER3 in strain BH010 were named MsER1, MsER2, and MsER3, respectively. MsER is an abbreviation of erythrose reductase in Moniliella sp. BH010. The PCR products were then gel-purified, ligated to pUCm-T vector, and transformed into competent cells of E. coli DH5α to facilitate DNA sequencing.

Fig. 2

Agarose gel electrophoresis of putative ER genes from strain BH010 amplified by PCR: lane 1 putative er1, lane 2 putative er2, lane 3 putative er3, lane 4 putative CmER gene

Fig. 3

Agarose gel electrophoresis of PCR products obtained using the cDNA of strain BH010 as a template: lane 1 putative CmER gene, lane 2 putative er1, lane 3 putative er2, lane 4 putative er3

Comparison of putative ER genes in strain BH010 with other ER genes

The positive clones were sequenced in both directions with the sequencing primers in pUCm-T vector. All the obtained DNA and cDNA sequences started with an ATG initiation codon, covered the whole open reading frame, and had a termination codon. To investigate the existence of introns in ER genes from strain BH010, the DNA sequences and the cDNA sequences were aligned. The result showed that a 67-bp intron starting with GT and ending with AG (Fig. 4a) existed in the MsER1 gene at the position 339 nucleotides downstream from the ATG initiation codon. Similarly, a 100-bp intron starting with GT and ending with AG (Fig. 4b) existed in the MsER3 gene, also at the position 339 nucleotides downstream from the ATG initiation codon. Lee et al. [20] reported the absence of introns in the ER gene from C. magnoliae JH110. Ookura et al. [35] amplified ER genes of T. megachiliensis SNG-42 by RT-PCR using mRNA as a template, and introns were not mentioned. This is the first report of the existence of introns in ER genes. Further research is needed to understand whether these introns affect the expression of the ERs encoded by the ER genes.
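The DNA versus cDNA comparison can be sketched in Python as a search for a single contiguous insertion in the genomic sequence, preferring a candidate that follows the GT...AG rule. This is an illustrative simplification, not the alignment procedure used in this study, and the toy exon and intron sequences are hypothetical.

```python
def find_single_intron(genomic, cdna):
    """Locate one contiguous insertion in `genomic` relative to `cdna`.

    Returns (offset, intron), where offset is the number of nucleotides
    preceding the intron, or None if no single insertion explains the data.
    """
    extra = len(genomic) - len(cdna)
    if extra <= 0:
        return None
    candidates = []
    for offset in range(len(cdna) + 1):
        if genomic[:offset] != cdna[:offset]:
            break  # the shared prefix ends here; larger offsets cannot match
        if genomic[offset + extra:] == cdna[offset:]:
            candidates.append((offset, genomic[offset:offset + extra]))
    # Repeated bases at exon/intron junctions make the boundary ambiguous,
    # so prefer the candidate with canonical GT...AG ends.
    for offset, intron in candidates:
        if intron.startswith("GT") and intron.endswith("AG"):
            return offset, intron
    return candidates[0] if candidates else None

# Hypothetical toy sequences (assumption for illustration only).
exon1 = "ATGTCCTACAACAAG"
intron = "GTAAGTTTCTAACAG"
exon2 = "AACATCCCTTAA"

result = find_single_intron(exon1 + intron + exon2, exon1 + exon2)
print(result)
```

In this toy case three insertion positions are consistent with the two sequences, because the last bases of the first exon repeat at the end of the intron; only one of them yields an insertion with GT and AG ends, which is the one returned.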

Fig. 4

Introns in ER genes in strain BH010: intron in a MsER1 gene and b MsER3 gene

The DNA sequences were aligned using the ClustalW2 program provided by EMBL-EBI, and the result indicated that the obtained MsER1 and MsER2 genes were identical (100 % identity). This can be explained by the fact that er1 shares high identity (95.4 %) with er2 in the coding region [35]. The amplification primers for the putative MsER1 and MsER2 sequences were designed on the basis of er1 and er2, and could not distinguish between the two genes. Further analysis by the BLAST program indicated that the obtained MsER1 and MsER2 DNA sequences were identical to gene er1 from T. megachiliensis SNG-42 except for the existence of a 67-bp intron. The result suggested that the putative MsER2 gene, namely the putative ER2 gene in Moniliella sp. BH010, was not detected in this study. However, we cannot exclude the possibility that the MsER2 gene is present in strain BH010. A BLAST analysis of the MsER3 DNA sequence obtained from strain BH010 indicated that it shared high identity (98.5 %) with er3 from T. megachiliensis SNG-42 except for the existence of a 100-bp intron. The MsER1 and MsER3 genes were submitted to the GenBank database with the accession numbers JQ798182 and JQ798183, respectively.
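The lack of primer specificity can be checked directly from the primer sequences given in the methods. The short Python sketch below (helper names are illustrative) shows that ERT3 and ERT4 are 3′-truncated versions of ERT1 and ERT2, so both primer pairs are expected to anneal to the same template, whereas the er3 forward primer ERT5 differs from ERT1 at several positions.

```python
# Primer sequences as listed in the methods, written 5' to 3' without spaces.
primers = {
    "ERT1": "ATGTCCTACAACAAGAACATCCC",
    "ERT2": "GTATAAGAGCACATTAAGCGTTAAT",
    "ERT3": "ATGTCCTACAACAAGAACATCC",
    "ERT4": "GTATAAGAGCACATTAAGCGTTA",
    "ERT5": "ATGTCTTACAAACAGTACATCCC",
    "ERT6": "ACTGAACTCAAAGGTTGGTGTTA",
}

def is_truncation(short, long):
    """True if `short` equals `long` with bases removed from the 3' end."""
    return len(short) < len(long) and long.startswith(short)

def count_mismatches(a, b):
    """Number of differing positions over the shared 5' length."""
    return sum(x != y for x, y in zip(a, b))

forward_truncated = is_truncation(primers["ERT3"], primers["ERT1"])
reverse_truncated = is_truncation(primers["ERT4"], primers["ERT2"])
ert1_vs_ert5 = count_mismatches(primers["ERT1"], primers["ERT5"])

print("ERT3 is a truncation of ERT1:", forward_truncated)
print("ERT4 is a truncation of ERT2:", reverse_truncated)
print("Mismatches between ERT1 and ERT5:", ert1_vs_ert5)
```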

Comparison of deduced amino acid sequences of putative ERs in strain BH010 with other aldo–keto reductases

The intronless MsER1 gene had a 987-bp open reading frame encoding a 328-amino-acid protein, whereas the intronless MsER3 gene had a 993-bp open reading frame encoding a 330-amino-acid protein. Since the coding sequence of the MsER1 gene was completely identical to er1 from T. megachiliensis SNG-42, the amino acid sequence encoded by the MsER1 gene was identical to that of ER1. The multiple sequence alignment of the deduced amino acid sequence of the MsER3 gene with other ERs is shown in Fig. 5. The deduced amino acid sequence of the MsER3 gene shared high identity (99 %) with that of er3 from T. megachiliensis SNG-42. The two sequences were identical, except for the 62nd (Lys↔Gln), the 25th (Val↔Ala), and the 317th (Ala↔Ser) amino acids, which are marked with boxes in Fig. 5.
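The relationship between the reported lengths can be verified with a few lines of Python. The sketch assumes that each reported open reading frame length includes the stop codon, which is consistent with the reported protein lengths, and that each gene contains only the single intron described above; the function name is illustrative.

```python
def protein_length(orf_bp):
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the final codon is the stop codon

# (intronless ORF length in bp, intron length in bp), as reported in the text.
genes = {"MsER1": (987, 67), "MsER3": (993, 100)}

for name, (orf_bp, intron_bp) in genes.items():
    genomic_bp = orf_bp + intron_bp  # start codon to stop codon, with intron
    print(name, protein_length(orf_bp), "aa;", genomic_bp, "bp genomic")
```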

Fig. 5

Multiple sequence alignment of the deduced amino acid sequence for MsER3 gene with other ERs. ERs are identified by the GenBank accession numbers. a T. megachiliensis SNG-42 ER1 (BAD90687); b T. megachiliensis SNG-42 ER2 (BAD90688); c T. megachiliensis SNG-42 ER3 (BAD90689); d Moniliella sp. BH010 ER3 (JQ798183); e C. magnoliae JH110 ER (FJ550210)

The deduced amino acid sequences of the MsER3 and MsER1 genes were compared with other protein sequences in the NCBI database using the BLASTP program. The putative MsER1 and MsER3 showed significant homology to the aldo–keto reductase superfamily, a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. MsER3 exhibited the highest levels of identity with the ERs from T. megachiliensis SNG-42 (87–99 % identity; BAD90688, BAD90687, BAD90689), aldehyde reductase 1 (57 % identity) from Sporidiobolus salmonicolor IAM 12258 (P27800), and an amino acid sequence related to GCY1-galactose-induced protein of aldo/keto reductase family (57 % identity) from Sporisorium reilianum SRZ2 (CBQ68613). MsER3 was also highly homologous to hypothetical protein sequences from Malassezia globosa CBS7966 (56 % identity; XP_001728849) and from Schizophyllum commune H4-8 (54 % identity; XP_003030734), which were annotated as aldo–keto reductases. A phylogenetic tree was constructed using the deduced amino acid sequences of the MsER3 and MsER1 genes and full-length amino acid sequences of aldo–keto reductases from various sources (Fig. 6). In the phylogenetic tree, MsER1 and MsER3 were close to the ERs from T. megachiliensis SNG-42, aldehyde reductase 1 from Coprinopsis cinerea, xylose reductase from Rhodotorula mucilaginosa, and aldehyde reductase 1 from Sporidiobolus salmonicolor. MsER1 and MsER3 had a very close evolutionary relationship with the ERs from T. megachiliensis SNG-42, which are believed to be members of the yeast aldo–keto reductase subfamily (AKR3B) [35], and on the basis of the phylogenetic analysis can be considered members of this subfamily as well.

Fig. 6

Phylogenetic analysis of the ERs from Moniliella sp. BH010. The phylogenetic tree was constructed using the full-length amino acid sequences of aldo–keto reductases from various sources, including the erythrose reductases reported. Aldo–keto reductases are identified by the corresponding GenBank accession numbers. The abbreviations MsER1 and MsER3 refer to erythrose reductase 1 (GenBank accession no. JQ798182) and erythrose reductase 3 (GenBank accession no. JQ798183) from Moniliella sp. BH010, respectively

Conclusion

A newly isolated yeast (strain BH010) that produced erythritol at a high concentration (110.61 g/l) was reported in this study. The D1/D2 domain of the 26S rDNA sequence, the ITS/5.8S rDNA sequence, and the 18S rDNA sequence were determined and analyzed with the NCBI BLAST algorithm. On the basis of the rDNA sequence analysis, strain BH010 was identified as a member of the Moniliella clade and was named Moniliella sp. BH010. Physiological characteristics were described. Scanning electron microscopy indicated that the cells were cylindrical to elliptical with an average size of 5 × 10 μm when grown in liquid medium, and that pseudohyphae and blastoconidia were present when the strain was cultivated on agar plates. The genes corresponding to ERs of strain BH010 were isolated and cloned. Comparison of the obtained sequences with other ER genes in the NCBI database suggested that the newly obtained ER genes shared very high identity with ER genes from another erythritol-producing yeast, T. megachiliensis SNG-42, except for the presence of introns. The deduced amino acid sequences were compared with other protein sequences in the NCBI database, and the data demonstrated that MsER1 and MsER3 were significantly homologous to the aldo–keto reductase superfamily. The report of these ER genes is helpful for understanding the mechanism of erythritol generation and may offer a potential opportunity to develop erythritol-producing microbial systems through recombination of ER genes.