Introduction

Mannan and heteromannan, after xylan, are the second most abundant hemicellulosic polysaccharides, together with cellulose and xylan, accounting for more than two- thirds of all renewable organic carbon resources in nature (Sandgren et al. 2001; Stalbrand 2003; Zhao et al. 2010). The complete biodegradation of mannan requires the synergistic action of several mannolytic enzymes, such as β-mannanases, exo-β-mannanases and β-mannosidases (Yoon and Lim 2007). β-Mannanases (β-1,4-d-mannan mannohydrolases, EC 3.2.1.78), which exist widely in various organisms such as bacteria, fungi, actinomycetes, plants and molluscs, can catalyze the random cleavage of β-1,4-d-mannosidic linkages of mannans and heteromannans to form mannooligosaccharides, which are suitable substrates for other mannolytic enzymes (Dhawan and Kaur 2007). To date, almost all known β-mannanases have been grouped into GH families 5, 26 and 113 based on their primary structure alignment and hydrophobic cluster analysis (http://www.cazy.org/) (Zhang et al. 2008). According to their three-dimensional (3-D) structure analysis, β-mannanases in the three families all belong to GH clan-A (Le Nours et al. 2005; Zhang et al. 2008).

During the past decades, many β-mannanase genes have been cloned and characterized from microorganisms and expressed in heterologous cells, such as Phialophora sp. P13 (Zhao et al. 2010), Bispora sp. MEY-1 (Luo et al. 2009), A. sulphureus MAFIC001 (Chen et al. 2007), A. niger BK01 (Cuong et al. 2009), A. aculeatus MRC11624 (Roth et al. 2009) and Bacillus sp. JAMB-750 (Hatada et al. 2005). Meanwhile, β-mannanases have been applied in diverse industrial bioprocesses, such as biobleaching pulps, depolymerizing an anti-nutritional factor, mannan, in feedstuffs, extracting oils from leguminous seeds and reducing the viscosity of coffee extracts, hydrolyzing mannan-based polymers in hydraulic fracturing of oil and gas wells, and the production of mannooligosaccharides (Luo et al. 2009). β-Mannanases played important roles in simplifying the industrial processes and improving the quality of products while reducing the environmental pollution caused by using the chemicals (Dhawan and Kaur 2007). However, the commercialization and broad applications of β-mannanases are still limited by their low activities and expensive production costs. Therefore, exploration and identification of novel β-mannanases with good properties and high activities are highly desirable.

In our previous studies, a filamentous fungus of A. usamii YL-01-78, which can produce an acidophilic β-mannanase (AuMan5A) with higher activity by solid-state fermentation, was isolated from the soil in China and preserved in our lab as reported previously (Li et al. 2006a). The partial enzymatic properties of the purified A. usamii AuMan5A were characterized, which possesses some crucial properties including the superior V max and chemical tolerance (Li et al. 2006b). In this work, we reported the cloning of the full-length cDNA and complete genomic copy of the gene that encodes the AuMan5A. Moreover, the bioinformatics analysis of the Auman5A and AuMan5A sequences was also described. To our knowledge, this is the first report on the cloning and bioinformatics analysis of a novel acidophilic β-mannanase gene, Auman5A, from A. usamii YL-01-78.

Materials and methods

Strains, media and vector

A. usamii YL-01-78, isolated from the soil in China as reported previously (Li et al. 2006a) and preserved in the lab of biochemistry and molecular biology, School of Medicine and Pharmaceutics, Jiangnan University, China, was used to extract the total RNA and genomic DNA. The strain was cultured in a liquid medium containing 1.0% of tryptone, 0.5% of yeast extract, 1.0% of dextrose and 0.5% of locust bean gum (Sigma, St. Louis, MO, USA); pH 6.0. E. coli JM109 (TaKaRa, Dalian, China), used as a host strain for gene cloning and DNA sequencing, was grown in a Luria–Bertani medium (Sambrook and Russell 2001). The pUCm-T vector (Sangon, Shanghai, China) was used for the vector-mediated PCR amplification, a novel PCR technique originally developed in our lab, as well as for the directly cloning of PCR products.

Total RNA and genomic DNA extraction

A. usamii YL-01-78 was cultivated in above mentioned medium at 32°C for 24 h on a rotary incubator with 220 rev/min. The mycelia were harvested through filtration, and thoroughly washed with sterile deionized water. The total RNA was extracted from the mycelia using the one-step method according to the instruction of TRIzol Kit (Sangon, Shanghai, China). Analytical results of the extracted total RNA showed that the ratio of OD260 to OD280 was 1.98, and the 18S and 28S rRNA bands, characterized as eukaryotes, on formaldehyde denatured agarose gel electrophoresis were specific, indicating that the total RNA has high purity and is not decomposed. Extraction of the genomic DNA from A. usamii YL-01-78 was performed according to the method as reported previously (Wu et al. 2011).

Enzyme activity and protein assays

The β-mannanase activity was quantitatively assayed by measuring the amount of released reducing sugar from locust bean gum (Sigma) using the 3,5-dinitrosalicylic acid (DNS) method as reported previously (Li et al. 2006a). One unit (U) of β-mannanase activity was defined as the amount of enzyme librating 1 μmol of reducing sugar per min under the assay conditions, using d-mannose as standard. Protein concentration was measured by a BCA-200 Protein Assay Kit (Pierce, Rockford, IL), using bovine serum albumin as standard. Sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE) was performed on a 12.5% gel using the reported method (Laemmli 1970).

Purification and deglycosylation of the AuMan5A

The cultivated koji (10 g) of A. usamii YL-01-78 was extracted with 10 volumes (w/v) of 20 mmol/l phosphate buffer (pH 7.0) at 30°C for 30 min with shaking at 100 rev/min. The crude extract was brought to 75% saturation by adding solid ammonium sulfate. The precipitate was collected by centrifugation and then dissolved in 10 ml of the same phosphate buffer. Subsequent manipulations were performed according to the reported method (Li et al. 2006b). The carbohydrate content of the purified AuMan5A was determined by the phenol sulfuric acid method (Dubois et al. 1956), using d-mannose as standard. The glycoprotein was deglycosylated by denaturing the purified AuMan5A at 100°C for 10 min, and then endoglycosidase H (Endo H, New England Biolabs) was added to perform deglycosylation at 37°C for 1 h. The manipulation followed the manufacture’s instructions.

Primers for PCR amplification

After aligning three sequences of the GH family 5β-mannanases from fungi, A. aculeatus (AAA67426), A. sulphureus (ABC59553) and T. reesei (AAA34208), we found that there are two peptide fragments with the highest identity, GYFAGTNS (/C)YW and STINTGADGLQ, located in the N-terminal region. Two degenerate primers manF1 and manF2 were designed corresponding to GYFAGTN and STINTGA, respectively. Primers dT-PR and PR (original names, Oligo dT-M13 Primer M4 and M13 Primer M4, respectively) provided by the RNA PCR Kit (TaKaRa, China), as well as manF1 and manF2 were used for the cloning of 3′-end cDNA fragment. Primers OP and IP (5′ RACE Outer Primer and 5′ RACE Inner Primer, respectively) provided by 5′-Full RACE Kit (TaKaRa, China) together with manR1 and manR2 were used to clone the 5′-end cDNA fragment. Primers FmanF and FmanR were used to amplify the full-length cDNA sequence or the central region of the Auman5A. Using pUCm-T vector-mediated PCR method, the 5′ flanking regulatory region of the Auman5A was amplified with T-5frrF (identical to the 21-bp fragment upstream the T/A clone site of pUCm-T vector), manR1 and manR2, while the 3′ flanking regulatory region with manF3, manF4 and T-3frrR (complementary to the 20-bp fragment downstream the T/A clone site). CmanF and CmanR were used for the cloning of the complete DNA sequence. As listed in Table 1, all primers (except those provided by Kits) were synthesized by Sangon (Shanghai, China).

Table 1 The sequences of primers for PCR amplification

Cloning of the full-length cDNA of the gene Auman5A

The 3′-end fragment of AuMan5A cDNA was amplified by using RNA PCR Kit and nest PCR technique. The dT-PR was used as primer for reverse transcription of the first-strand cDNA. Using the resulting first-strand cDNA as template, the first-round PCR amplification was performed with manF1 and PR as following conditions: an initial denaturation at 94°C for 2 min; 30 cycles of at 94°C for 30 s, 53°C for 30 s, 72°C for 90 s; an extra elongation at 72°C for 10 min. And then the second-round PCR amplification was carried out with manF2 and PR for confirmation (nest PCR). Next, the 5′-end fragment, originating from the starting point of transcription, was amplified by using 5′-Full RACE Kit and nest PCR method. The first-strand cDNA was used as template for the first-round PCR amplification with OP and manR1, then subjected to the second-round PCR amplification with IP and manR2 for confirmation.

Finally, the full-length cDNA sequence either was obtained by assembling above cloned 3′- and 5′-end cDNA fragment sequences or directly amplified by conventional PCR with FmanF and FmanR using the first-strand cDNA as template.

pUCm-T vector-mediated PCR technique

The pUCm-T vector-mediated PCR amplification, a novel method initially developed in our lab to amplify the 5′ or 3′ flanking region of a known DNA sequence, was carried out by four steps, as flowcharted in Fig. 1 (exemplified as the cloning of a 5′ flanking region). Firstly, the genomic DNA was digested using two optimum restriction enzymes, which should be selected through a series of experiments. In this work, BamHI/EcoRV and PstI/EcoRV were selected for the cloning of 5′ and 3′ flanking regulatory regions, respectively. Secondly, the cohesive end(s) or blunt end(s) were filled in and added an adenine nucleotide (A) at 3′-ends using Ex Taq DNA polymerase at 72°C for 10 min. The third step was to ligate the second step’s products into pUCm-T vector. And finally, the recombinant plasmids were first amplified with T-5frrF and manR1 to obtain the 5′flanking regulatory region, and then subjected to the second-round PCR amplification with T-5frrF and manR2 for confirmation.

Fig. 1
figure 1

Flowchart of the pUCm-T vector-mediated PCR and nest PCR amplification methods (exemplified as the cloning of a 5′ flanking region of the gene Auman5A)

Cloning of the complete DNA of the gene Auman5A

The central region of the Auman5A was amplified from the genomic DNA of A. usamii YL-01-78 by conventional PCR with FmanF and FmanR. Next, the 5′ or 3′ flanking regulatory region was amplified by using pUCm-T vector-mediated PCR mentioned above. Finally, the complete DNA sequence either was obtained by assembling above three respectively cloned DNA fragment sequences or directly amplified by conventional PCR with CmanF and CmanR.

Nucleotide sequence accession number

The complete DNA sequence of the Auman5A along with its deduced amino acid sequence has been deposited in the GenBank database under the accession number HQ839639. To each cloned cDNA or DNA fragment, three independent clones were picked out for DNA sequencing. The results were adopted as three inserted cDNA or DNA fragments were identical with one another, or else the experiment was redone.

Analysis of amino acid and DNA sequences

The SignalP 3.0 (http://www.cbs.dtu.dk/services/SignalP/) was applied to predict the signal peptide of the AuMan5A. The putative N-glycosylation site was located using the NetNGlyc program 1.0 (http://www.cbs.dtu.dk/services/NetNGlyc/). Physico- chemical properties were identified using the Protparam (http://au.expasy.org/tools/protparam.html). The homology alignment of the primary structures between the AuMan5A and other GH family 5 β-mannanases was carried out in GenBank using the BLAST program. The 3-D structure of the AuMan5A was modeled using the Pred3D Web Server 1.0 (http://www.fundp.ac.be/sciences/biologie/urbm/bioinfo/esypred/), based on a crystal structure of the T. reesei RutC-30 β-mannanase from the GH family 5. The ORF was determined using the program of NCBI ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). The GeneMark (http://opal.biology.gatech.edu/GeneMark/eukhmm.cgi) was used for the localization of the exon/intron boundaries. The Berkeley Drosophila Genome Project (http://www.fruitfly.org/seq_tools/promoter.html) and the PLACE (http://www.dna.affrc.go.jp/PLACE/signalscan.html) were used to predict the promoter region and its characterization.

Results and discussion

Deglycosylation analysis and N-terminus sequencing

SDS–PAGE analysis of the purified AuMan5A showed a single protein band of about 42.0 kDa, an apparent M.W. of the AuMan5A. The carbohydrate content of the AuMan5A was determined to be 23.2% using the phenol sulfuric acid method. After the AuMan5A was treated with Endo H, only a single protein band of about 37.5 kDa was observed on SDS–PAGE, which is much smaller than an apparent M.W. of the AuMan5A. These results strongly confirmed that the AuMan5A is a glycoprotein. The N-terminal amino acid sequence of the AuMan5A was analyzed on a 470A automatic sequencer of Applied Biosystems (Foster, CA, USA). The sequence of N-terminal 15 amino acid residues was determined to be SFASTSGLQFTIDGE, which is identical to that of the A. sulphureus β-mannanase (Chen et al. 2007).

Cloning and analysis of the full-length cDNA sequence

An approximate 1.2-kb 3′-end cDNA fragment and another faint band were amplified. Each band was agarose gel-purified and subjected to the second-round PCR for confirmation. Only the 1.2-kb fragment can be amplified again, and was ligated into pUCm-T vector. DNA sequencing result verified that the inserted cDNA fragment is exactly 1,214 bp in length (except primer dT-PR). An about 450-bp 5′-end cDNA fragment was amplified as a major PCR product, and then subjected to the second-round PCR for confirmation. DNA sequencing result showed that the first-round major PCR product is exactly 404 bp in length (except primers OP and IP), containing a 191 bp of sequence identical to that between manF1 and manR1, and a new 213 bp of sequence in which there are a starting point of transcription (C) and the cDNA fragment coding for a peptide of 38 amino acids and a determined N-terminal 15 amino acids (SFASTSGLQFTIDGE) of the AuMan5A (Fig. 2).

Fig. 2
figure 2

Nucleotide sequence of the cDNA or genomic copy of the gene Auman5A from A. usamii YL-01-78 and its deduced amino acid sequence of the AuMan5A. Two introns with 63 and 60 bp are shown in lowercase letters. A signal peptide from Met1 to Ala21 and a propeptide from Leu22 to Thr38 are underlined. The determined N-terminal 15 amino acid sequence of the purified AuMan5A is indicated in a grayed box. The bold letters of TATAAA and C in boxes indicate the putative TATA box and the starting point of transcription, respectively. The grayed italic letters of ATG and TAA represent the starting codon and stop codon, respectively. The putative polyadenylation signal, AATTAA, is shown as grayed underlined letters. The bold arrows below the letters represent the primers for PCR amplification

Using the first-strand cDNA as template, an approximate 1.4-kb full-length cDNA sequence was directly amplified with FmanF and FmanR, which is identical to that obtained by assembling cloned 3′- and 5′-end cDNA fragments. It is 1,427 bp in length (except polyA), containing a 51 bp of 5′ non-coding region, a 1,152 bp of ORF coding for a protein of 383 amino acid residues, and a 224 bp of 3′ non-coding region.

Primary structure analysis of the AuMan5A

The SignalP 3.0 predicted an unambiguous signal peptide cleavage site between Ala21 and Leu22, indicating that the AuMan5A is a secretory protein. With the information of the determined N-terminal 15 amino acid residues of the purified AuMan5A, it was probable that the preproAuMan5A of 383 amino acid residues was predicted to contain a 21-aa signal peptide from Met1 to Ala21, a 17-aa propeptide from Leu22 to Thr38 and a 345-aa AuMan5A (Fig. 2). Propeptides also exist in other β-mannanases (Ademark et al. 1998) or microbial enzymes (Sayari et al. 2005; Wakiyama et al. 2008). The M.W. of 37,614 Da which is in agreement with that of the deglycosylated AuMan5A and pI of 4.09 are calculated from the deduced AuMan5A. There are two putative N-glycosylation sites (N156–S157–S158 and N225–F226–T227) in the AuMan5A sequence. Amino acid homology alignment showed that identities of the AuMan5A with other four β-mannanases of A. niger (ACJ06979), A. sulphureus (ABC59553), A. aculeatus (AAA67426) and T. reesei (AAA34208) from GH family 5 were 98.8, 93.9, 73.4 and 46.8%, respectively (Fig. 3). Moreover, two putative catalytic glutamate residues, Glu168 and Glu276 located respectively in the β4 and β7 strands, were found to be highly conserved among GH family 5 members (Zhao et al. 2009). These features strongly implied that the AuMan5A is a member of GH family 5.

Fig. 3
figure 3

Amino acid homology alignment of the AuMan5A with other four β-mannanases from GH family 5 based on the primary structural features determined with the BLAST program in GenBank. Aus Aspergillus usamii (HQ839639), Ani Aspergillus niger (ACJ06979), Asu Aspergillus sulphureus (ABC59553), Aac Aspergillus aculeatus (AAA67426) and Tre Trichoderma reesei (AAA34208). The identical amino acid residues in five β-mannanases are indicated by solid black boxes

Three-dimensional structure analysis of the AuMan5A

Based on the crystal structure of the T. reesei RutC-30 β-mannanase from the GH family 5, we modeled the 3-D structure of the AuMan5A as shown in Fig. 4. The 3-D structure consists principally of the (β/α)8 barrel fold. The structure has been likened to a ‘salad bowl’, with one face of the molecule having a large radius (approximately 45 Å) due to an elaborate loop architecture, while the opposite face, which consists of simple α/β turns, has a radius of approximately 30 Å. This is similar to the fold described for GH family 10 enzymes and both are members of GH clan-A. Indeed, these two families are quite closely related and in addition to sharing a common fold they have the same type of catalytic mechanism and share several common residues (Larson et al. 2003, 2006). The catalytic residues, Glu168 and Glu276, locate in the hydrophobic cleft of the AuMan5A, where the β-1,4-mannosidic linkages of mannan or heteromannan backbone insert and get cleaved.

Fig. 4
figure 4

The three-dimensional structure model of the AuMan5A predicted by using Pred3D Web Server 1.0 based on the crystal structure of the T. reesei RutC-30 β-mannanase (1QNO) from the GH family 5. The 3-D structure consists principally of the (β/α)8 barrel fold. The Glu168 and Glu276 are putative catalytic residues in the AuMan5A

Cloning of the complete DNA sequence

An approximate 720-bp 5′ flanking regulatory region of the Auman5A was amplified, and then cloned into pUCm-T vector, followed by DNA sequencing. Result showed that the inserted 5′ flanking regulatory region is exactly 657 bp in length (except the sequence from the primer T-5frrF to T/A clone site of pUCm-T vector), containing a new 275-bp sequence and a 382-bp sequence identical to that from the starting point of transcription to manR2. An approximate 680-bp 3′ flanking regulatory region of the Auman5A was amplified. DNA sequencing result showed that the inserted 3′ flanking regulatory region is exactly 620 bp in length (except the sequence from T-3frrR to T/A clone site), containing a 277-bp sequence identical to that between manF4 and FmanR and a new 343-bp sequence (Fig. 2). An about 1.6-kb central region of the Auman5A was amplified by conventional PCR with FmanF and FmanR.

Using the genomic DNA as template, an about 2.2-kb complete DNA sequence of the Auman5A was amplified by PCR with CmanF and CmanR. DNA sequencing result showed that the complete DNA sequence directly amplified is entirely identical to that obtained by assembling three respectively cloned DNA fragments.

DNA sequence characterization of the gene Auman5A

The complete DNA sequence of the Auman5A is exactly 2,168 bp in length (Fig. 2). Compared with the full-length cDNA sequence, the gene Auman5A is composed of a 275 bp of 5′ flanking regulatory region, a 51 bp of 5′ non-coding region, two short introns with 63 and 60 bp, respectively, a 1,152 bp of ORF, a 224 bp of 3′ non-coding, and a 343 bp of 3′ flanking regulatory region. All of the exon/intron boundaries conform to the canonical GT-AG rule. It was predicted that the promoter region of the gene Auman5A locates at the range from −40 to +10 bp, designating the starting point of transcription (C) as +1 bp. A TATAAA sequence as a classical TATA box locates at −20 bp upstream the starting point of transcription, which is in agreement with the consensus distance generally found in promoter regions. In some cases, other consensus sequences, such as TTATTT, could act as substitutes for the classical TATA box (Reed and Gibson 1994; Lampidonis et al. 2008). In eukaryotes, a functional CAAT box is typically found about −75 bp upstream the starting point of transcription. Some CAAT boxes may locate further from the starting point (Wenkel et al. 2006). However, we did not found any CAAT box in the gene Auman5A until −275 bp. It was also found that a AATTAA sequence as a putative polyadenylation signal locates at +1,510 bp downstream the starting point of transcription. Bioinformatics analysis indicated that the 5′ flanking regulatory region possesses several putative transcription factor binding sites. Four HSF (heat shock factor) binding sites were found at the positions of −32, −50, −74 and −144 bp, respectively; five GATA-1 (GATA-binding factor 1) binding sites at −125, −144, −172, −200 and −223 bp, respectively; one GATA-2 (GATA-binding factor 2) binding site at −144 bp.

Conclusions

The AuMan5A purified from the cultivated koji of A. usamii YL-01-78 possesses some predominant properties including the superior V max, chemical tolerance and high thermostability, which are very suitable for industrial applications (Li et al. 2006b). However, its purification processes were time-consuming and laborious. To simplify the purification manipulations and to produce the AuMan5A inexpensively, it is necessary to express the gene Auman5A in heterologous cells, such as Pichia pastoris GS115, one of the favorite expression hosts.

In this work, we developed a procedure to clone the full-length cDNA and complete genomic copy of the gene Auman5A from A. usamii YL-01-78 by using four steps of PCR amplification based on different principles. In addition, the bioinformatics analysis of the Auman5A and AuMan5A sequences was also reported. Our present work founded a solid basis for further researches on the gene Auman5A expression in P. pastoris GS115 and the relationship between AuMan5A’s structure and function, as well as improvement of its catalytic activity and other enzymatic properties by means of protein engineering, such as the site-directed mutagenesis, directed evolution or computational design.