Introduction

Biological and integrated control of fungal plant diseases currently receives much attention worldwide, and there is therefore a need for biocontrol agents (Bourbos and Skoudridakis 1994). Chaetomium cupreum has considerable potential for the biocontrol of plant pathogens (Manandhar et al. 1998; Soytong 2003). Chaetomium spp. belong to the Ascomycetes (family Chaetomiaceae, genus Chaetomium) and normally occur in soil or organic compost (Seonju and Richard 1999). Some species of Chaetomium are very good producers of cellulase and laccase and are therefore important to the biotechnology industry (Ankudimova et al. 1999; Mimura et al. 1999; Chefetz et al. 1998). The agricultural significance of Chaetomium spp. lies in the ability of some strains to suppress infection by plant pathogens (Hubbard et al. 1982; Aggarwall et al. 2004; Pietro et al. 1992). C. cupreum also antagonizes a wide range of plant pathogens such as Pyricularia oryzae, Rhizoctonia solani, and Curvularia lunata (Soytong 2003; Yang 2003; Liao 2002). The biocontrol mechanisms of Chaetomium spp. generally include colonizing plant roots or seeds to prevent parasitism by deleterious rhizosphere microorganisms, secreting a variety of degrading enzymes (Inglis and Kawchuk 2002) and antibiotics (Kobayashi et al. 1996; Jiao et al. 2004; Kanokmedhakul et al. 2006), stimulating plant growth, and inducing resistance in the plant host (Soytong 2003). Knowledge of the integrated biocontrol mechanism is critical for increasing the biocontrol ability of C. cupreum and is also a resource for further research on biocontrol fungi. To obtain a more comprehensive understanding of the biocontrol mechanism of C. cupreum, it is important to identify expressed genes with biocontrol function. EST analysis has been shown to allow rapid, large-scale functional elucidation of genes (Adams et al. 1991).
At present, EST projects in fungal biology focus mainly on identifying genes expressed in plant pathogens, for example, genes related to mycelial development and pathogenicity in Gibberella zeae (Frances et al. 2003), genes expressed while Sclerotinia sclerotiorum infects its host (Rugang et al. 2004), and genes involved in the pathogenesis of Phytophthora parasitica (Franck et al. 2005) (Fig. 1). EST studies of biocontrol agents have been carried out on Trichoderma harzianum; of the 1,740 unigene sequences derived from a T. harzianum mycelial cDNA library, 55 genes were identified that may be involved in biocontrol (Liu and Yang 2005). The EST approach was also used successfully on T. harzianum strain CECT 2413, for which 3,478 unique genes were obtained from eight different cDNA libraries. That study provided a basis for expression profiling, enabling the identification of genes involved in specific physiological processes, including biocontrol (Vizcaino et al. 2006). In this paper, we report the ESTs expressed in the mycelium of C. cupreum. This is the first report focusing on gene expression in C. cupreum; it offers a first view of the C. cupreum genome and provides potentially useful information for elucidating the molecular mechanisms underlying its competence as a biocontrol agent against plant pathogens.

Fig. 1 The confrontation of C. cupreum with S. sclerotiorum

Materials and methods

Fungal strain

C. cupreum isolate was kindly provided by King Mongkut’s Institute of Technology Ladkrabang of Thailand.

Construction of cDNA library and DNA sequencing

C. cupreum was grown in PD (potato dextrose) medium for 60 h at 27 °C and 140 rpm; the mycelium was then harvested and immediately frozen at −80 °C. Approximately 0.5 g of sample was ground to a fine powder in liquid nitrogen, and total RNA was extracted with TRIzol Reagent (Life Technologies). The mRNA was isolated with Oligotex mRNA kits (Qiagen) according to the manufacturer’s instructions. Double-stranded cDNA fragments larger than 700 bp were ligated into the pBluescriptII vector and cloned into Escherichia coli DH10B. Aliquots of the library at different dilutions were incubated overnight on Luria broth medium at 37 °C in the presence of X-gal and IPTG. Clones for EST sequencing were randomly selected from the agar plates, inoculated into 96-well plates with 5 ml culture of Luria broth containing ampicillin, and grown overnight. Plasmid DNA was isolated by an alkaline lysis procedure, and templates for DNA sequencing were prepared by specific polymerase chain reaction (PCR) amplification. The cDNA fragments were sequenced unidirectionally with the T3 primer on a MegaBACE 1000 DNA sequencer (Amersham Pharmacia Biotech).

Data processing and analysis

Raw sequence files were imported to a Unix server and converted into FASTA format by the Phred program (Ewing et al. 1998). The Crossmatch program was used to trim vector sequences, and sequences containing more than 5% ambiguous bases and sequences shorter than 100 bp were removed. Sequences were then assembled by the Phrap program (http://www.phrap.org/), and the accuracy of the contigs was confirmed with the Consed program (Gordon et al. 1998). All unique sequences were compared, using BlastN (E-value <10⁻⁵), against the C. globosum ESTs, the T. harzianum ESTs deposited in our laboratory, which were derived from a cDNA library of mycelium grown on PDA medium, and other fungal EST sequences publicly available in the GenBank database. Functional annotation of the ESTs was performed by searching the nonredundant (nr) protein sequence database of GenBank for similar putative genes using the BlastX algorithm with default parameters; similarity was declared significant when the E-value was lower than 10⁻¹⁰. Unigenes with known matches were classified into functional groups according to the gene mapping provided by the GO website, with an E-value lower than 10⁻⁵ (GO, http://www.geneontology.org/) (Ashburner et al. 2000). Unigenes were selected for KEGG pathway analysis when the length was more than 30 amino acids, the protein similarity was more than 30%, and the E-value was lower than 10⁻⁵ compared with the protein sequence database of KEGG; they were then assigned to the metabolic pathways defined by KEGG according to the corresponding enzyme number (Kyoto Encyclopedia of Genes and Genomes, http://www.genome.ad.jp/kegg/) (Kanehisa and Goto 2000). All sequences have been deposited in the GenBank database under accession nos. DV544375–DV547440.
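The sequence-level filter described above (discarding reads with more than 5% ambiguous bases or shorter than 100 bp) can be illustrated with a minimal Python sketch. This is not the Phred/Crossmatch pipeline used in the study; the function names are hypothetical, and treating every character other than A, C, G, or T as ambiguous is an assumption made only for illustration.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {identifier: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def ambiguous_fraction(seq):
    """Fraction of bases that are not A, C, G, or T (assumed definition)."""
    if not seq:
        return 1.0
    upper = seq.upper()
    unambiguous = sum(upper.count(base) for base in "ACGT")
    return (len(seq) - unambiguous) / len(seq)


def passes_quality(seq, min_length=100, max_ambiguous=0.05):
    """Keep a read only if it is long enough and has few ambiguous bases."""
    return len(seq) >= min_length and ambiguous_fraction(seq) <= max_ambiguous


def filter_records(records, min_length=100, max_ambiguous=0.05):
    """Return only the records that pass both thresholds."""
    return {
        name: seq
        for name, seq in records.items()
        if passes_quality(seq, min_length, max_ambiguous)
    }


if __name__ == "__main__":
    # Made-up reads, not data from the study.
    example = ">read1\n" + "ACGT" * 30 + "\n>read2\n" + "ACGT" * 20 + "\n"
    kept = filter_records(parse_fasta(example))
    print(sorted(kept))
```

Both thresholds are exposed as parameters so that the same sketch can be tried with other cutoffs.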

Results

Characterization of the cDNA library

The cDNA library of C. cupreum was generated successfully with the following characteristics. The titer of the unamplified cDNA library was 0.93 × 10⁶ pfu/ml. Blue/white plaque selection indicated that the recombinant rate of the library was 94.2%. PCR amplification of randomly selected recombinant plaques was used to assess the quality of the library. The PCR products showed that all the plaques had inserts; insert sizes ranged from 500 to 3,000 bp, and most were above 1.2 kb. Partial DNA sequence analysis was also performed on randomly selected cDNA clones to assess the quality of the cDNA library.

ESTs analysis

cDNA clones with inserts larger than 700 bp were selected for sequencing from the 5′ end of the cDNA fragments. After removal of ribosomal, mitochondrial, and vector sequences, 3,066 high-quality sequences of at least 100 bp were obtained from 3,647 cDNA clones for further analysis. The minimum, average, and maximum lengths of the high-quality sequences were 102, 501, and 773 bp, respectively. Using the Phrap and Consed programs, the 3,066 ESTs were assembled into 1,471 unigenes comprising 392 contigs and 1,079 singletons. BlastX analysis revealed that 874 (59.4%) unigenes could be assigned a putative identity based on strong sequence similarity to proteins in GenBank. Because the primary cDNA library was not normalized, the number of cDNA clones derived from an mRNA reflects the expression level of the corresponding gene in the mycelial growth stage. Contigs containing ten or more ESTs are listed in Table 1. The most prevalent contig consisted of 109 ESTs and had significant similarity to glyceraldehyde-3-phosphate dehydrogenase. The second most abundant contig (63 ESTs) was homologous to coproporphyrinogen oxidase. The third most abundant contig showed no similarity to any known gene in the database and represented a predicted protein. Glyceraldehyde-3-phosphate dehydrogenase, pyruvate kinase, pyruvate decarboxylase, and EF1-alpha translation elongation factor, which are involved in protein translation, carbon metabolism, and energy production, were also highly expressed, indicating that the mycelium was at a peak of metabolic activity. Other abundant ESTs encoded ATP synthase, alcohol dehydrogenase, and ADP-ATP translocase, implying that energy metabolism was also active in C. cupreum cells. Approximately 597 (40.6%) of the unigenes showed no or low similarity to protein sequences in the NCBI database according to the BlastX algorithm.
The isolation of expressed sequences with no annotated function is also a potentially important finding.
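The counts reported in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text (3,066 ESTs, 392 contigs, 1,079 singletons, 874 annotated unigenes, and the accession range DV544375–DV547440); the class and function names are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblySummary:
    """Counts taken from the text of this section."""
    ests: int
    contigs: int
    singletons: int
    annotated_unigenes: int

    @property
    def unigenes(self):
        # A unigene is either a contig or a singleton.
        return self.contigs + self.singletons

    @property
    def annotated_pct(self):
        return round(100 * self.annotated_unigenes / self.unigenes, 1)

    @property
    def unannotated_pct(self):
        unannotated = self.unigenes - self.annotated_unigenes
        return round(100 * unannotated / self.unigenes, 1)


def accession_range_size(first, last, prefix="DV"):
    """Number of accessions in an inclusive range such as DV544375 to DV547440."""
    return int(last[len(prefix):]) - int(first[len(prefix):]) + 1


if __name__ == "__main__":
    summary = AssemblySummary(ests=3066, contigs=392, singletons=1079,
                              annotated_unigenes=874)
    print(summary.unigenes)         # 1471
    print(summary.annotated_pct)    # 59.4
    print(summary.unannotated_pct)  # 40.6
    print(accession_range_size("DV544375", "DV547440"))  # 3066
```

The percentages are computed against the 1,471 unigenes, which is the denominator that reproduces the reported 59.4% and 40.6%.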

Table 1 Assembled contigs containing more than 10 ESTs

BlastN analysis revealed that 803 sequences had similarity to genes in GenBank, including ESTs from other organisms. Comparative analysis of the C. cupreum ESTs with fungal ESTs showed that 709 unigenes had significant similarity to fungal protein sequences. The comparison revealed that 579 C. cupreum genes were similar to T. harzianum genes. The genes abundantly expressed in both the C. cupreum and T. harzianum EST collections were cyclophilin (CB897191), short-chain dehydrogenase (CK906841), serine palmitoyl CoA transferase (CK907043), sterol 8-isomerase (CK909593), β-glucosidase (CK908283), and alcohol oxidase (CK908231). About 36 C. cupreum ESTs were similar to the short-chain dehydrogenase of T. harzianum, 15 ESTs were similar to sterol 8-isomerase, and 66 ESTs were similar to a hypothetical protein of T. harzianum (CK907495). The biocontrol-related genes expressed in both the C. cupreum and T. harzianum EST collections were class V chitinase, β-1,3-exoglucanase (CK909006), sterol 8-isomerase, and β-glucosidase. Comparison of the C. cupreum ESTs with the ESTs of C. globosum using BlastN showed that 398 genes were homologous. The genes for cyclophilin (BP099197), a protein related to nebula protein (BP113071), polyubiquitin, and three unnamed proteins (BP113269, BP113305, and BP113096) were highly abundant in both EST collections.

Of the ESTs, 2,033 with similarity in the BlastX analysis were assigned to the three GO categories, covering 60 different functions. More than half of these ESTs (59.2%) showed similarity to genes involved in biological process; the most abundantly expressed were associated with metabolism (22.7%), and other highly expressed ESTs were associated with cellular physiological process (15.4%), regulation of physiological process (3.15%), response to stimulus (2.9%), and pigment (3.64%). The next most prevalent group (29.33%) encoded proteins assigned to molecular function, including ligase activity (0.25%), transcription factor activity (0.4%), lyase activity (0.64%), and electron transporter activity (0.54%); clones with oxidoreductase activity (11%) and nucleic acid binding functions (2.9%) were especially abundant. Notably, six ESTs encoded an immunomodulatory protein related to toxin activity (0.3%). Two hundred and thirty-six ESTs (11.6%) were grouped into cellular component, including the respiratory chain complex I (0.1%), unlocalized protein complex (0.05%), intracellular organelle (3.7%), and membrane (0.1%) categories. The diversity of gene expression implies the complexity of metabolism in C. cupreum mycelium.

Metabolism pathways analysis

ESTs that showed similarity to known proteins were classified into different metabolic pathways (Table 2). The most represented pathway was glycolysis, reflecting its important role in cell metabolism. The second most represented category was porphyrin and chlorophyll metabolism; fungi cannot produce chlorophyll, but they do have a heme biosynthetic pathway. The main gene related to heme biosynthesis found in the C. cupreum cDNA library was coproporphyrinogen oxidase, which was represented by 72 ESTs. It catalyzes the conversion of coproporphyrinogen-III to protoporphyrinogen-IX and is an essential enzyme in the heme biosynthetic pathway. Forty-seven ESTs (18%) belonged to the citrate cycle category, which has a role in energy metabolism. Thirty-four ESTs were associated with electron transport and oxidative phosphorylation, which also play a key role in cell energy metabolism. Many genes found in the cDNA library were connected with protein biosynthesis and degradation pathways, including glutamate, methionine, cysteine, arginine and proline, histidine, tyrosine, and tryptophan metabolism; phenylalanine, valine, leucine, and isoleucine degradation; valine, leucine, and isoleucine biosynthesis; and tyrosine and tryptophan biosynthesis. The proteolytic system found in these categories may play an important role in the biocontrol ability of C. cupreum, with proteases taking part in the breakdown of the host cell wall of pathogens. Some pathways related to saccharide metabolism were also found, for example, galactose, fructose and mannose, starch and sucrose, nucleotide sugar, and aminosugar metabolism, and glycoprotein and peptideprotein biosynthesis. The relationship between the assignment of a protein to a metabolic pathway and its structure, considered in terms of both its conformational fold and its evolutionary superfamily, can then be analyzed.
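The selection criteria stated in Materials and methods for the KEGG mapping (length above 30 amino acids, protein similarity above 30%, E-value below 10⁻⁵, then grouping by enzyme number) can be sketched as follows. This is only an illustration under assumptions: the `Hit` record, its field names, and the example hits are hypothetical and are not output from the study or from any KEGG tool.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One unigene-to-KEGG-protein match (hypothetical record layout)."""
    unigene: str
    ec_number: str
    aligned_length_aa: int
    percent_similarity: float
    evalue: float


def passes_kegg_criteria(hit, min_length=30, min_similarity=30.0,
                         max_evalue=1e-5):
    """Apply the three thresholds given in Materials and methods."""
    return (hit.aligned_length_aa > min_length
            and hit.percent_similarity > min_similarity
            and hit.evalue < max_evalue)


def group_by_enzyme(hits):
    """Map each EC number to the sorted unigenes that pass the filter."""
    groups = defaultdict(set)
    for hit in hits:
        if passes_kegg_criteria(hit):
            groups[hit.ec_number].add(hit.unigene)
    return {ec: sorted(names) for ec, names in groups.items()}


if __name__ == "__main__":
    # Made-up hits for illustration only.
    hits = [
        Hit("unigene_A", "1.3.3.3", 180, 72.0, 1e-60),   # passes
        Hit("unigene_B", "1.2.1.12", 25, 90.0, 1e-8),    # too short
        Hit("unigene_C", "1.2.1.12", 150, 28.0, 1e-12),  # similarity too low
        Hit("unigene_D", "1.2.1.12", 200, 85.0, 1e-90),  # passes
    ]
    print(group_by_enzyme(hits))
```

Grouping by EC number mirrors the statement that unigenes were assigned to pathways according to their corresponding enzyme number; the mapping from EC number to pathway is left out because it depends on the KEGG database release.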

Table 2 KEGG biochemical pathway mappings of C. cupreum

Candidate genes associated with biocontrol function

The genes identified during mycelial growth reflected the biocontrol functions of C. cupreum. The expressed genes encoding biocontrol-associated proteins were related to cell wall degradation, antifungal metabolite production, proteolytic function, production of substances that enhance plant resistance, and self-resistance to metabolites (Table 3). The mycoparasitic process is presumed to involve various fungal cell wall degrading enzymes, including chitinase, glucanase, and glucosidase. Indeed, cDNAs encoding cell wall hydrolases were found in the library of C. cupreum. Of the 142 ESTs associated with biocontrol, the 35 ESTs (24.65%) encoding cell wall degrading enzymes accounted for the highest percentage; they included β-1,3-exoglucanase (DV546424, DV544842, DV544713, DV546012, DV544470, DV545432, DV544477, DV547403, DV544454, DV544484, DV545129, DV546364, DV545072, DV544562, DV545952, DV544979, DV546676, DV546546, DV546460, DV545542, DV544737, DV544416), endoglucanase IV (DV545380), β-glucosidase 5 (DV545481, DV545793, DV545126, DV546743, DV546476, DV548542), β-glucosidase (DV546469), β-glucosidase 6 (DV547247), and chitinase (DV546055, DV544732, DV544989, DV544376). Of these, ESTs similar to the β-1,3-exoglucanase gene were found 22 times, ESTs similar to β-glucosidase eight times, and ESTs similar to chitinase four times. Chitinases are considered potential biocontrol agents against fungi and play a key role in the antagonism of some plant pathogens. β-1,3-Glucanases (exo or endo) and β-glucosidase are major wall-lytic enzymes responsible for dissolving the β-1,3-glucan in the host cell wall during mycoparasitism (Figs. 2 and 3). Proteases are also regarded as antifungal proteins secreted as part of the defense responses of the biocontrol fungus.
The subtilisin-like serine protease genes (DV546459, DV546294, DV544423, DV546484) and aspartic proteinase genes (DV545776, DV547231, DV546784, DV546176, DV546884, DV544891, DV547261, DV547321, DV545627, DV544971, DV546366, DV546126, DV544706, DV545010, DV544844, DV545924, DV546967, DV545636, DV546174, DV546528, DV547357) were also identified in the cDNA library of C. cupreum. In our study, 25 ESTs related to proteolytic degradation were identified, including four ESTs for subtilisin-like serine protease and 21 ESTs for aspartic proteinase, which accounted for about 19.72% of the total biocontrol ESTs. Chaetomium spp. are also known to produce antifungal metabolites such as chetomin and rotiorinols A–C. Genes for multidrug resistance protein (DV545494, DV544640, DV544681, DV546719, DV545349, DV547367), polyketide synthase (DV547431, DV546945), and isopenicillin N synthase and related dioxygenases (DV546194, DV546282, DV544663) were obtained from the cDNA library of C. cupreum. Terpene compounds are involved in the biocontrol process because of their antifungal properties. C-4 sterol methyl oxidase (DV544986, DV544993, DV545404, etc.), C-8 sterol isomerase (DV544900, DV547178), and sterol C-22 desaturase (DV546589) are three enzymes in the biosynthesis of ergosterol, a triterpene derivative. Xylanase genes, which are related to the induction of plant resistance, were also obtained from C. cupreum. The fungus itself is able to resist external stress. DHA14-like major facilitator (DV544766, DV546717) can confer tolerance to toxic compounds on the fungus, and two ESTs of C. cupreum encoding DHA14-like major facilitator genes were found. Nine ESTs (DV544687, DV545882, DV546061, DV546443, DV546757, DV546969, DV547432, DV548098, and DV548106) encoded β-1,3-glucan binding protein, which is involved in the recognition of invading microorganisms and acts to protect the fungus.

Table 3 The ESTs associated with biocontrol function of C. cupreum
Fig. 2 The mycoparasitism of C. cupreum on P. capsici

Fig. 3 The mycoparasitism of C. cupreum on R. solani

Discussion

EST analysis has proven to be an efficient approach for identifying genes expressed under a wide variety of conditions and in a wide variety of fungal systems. In the vascular wilt pathogen Verticillium dahliae (Neumann and Dobinson 2003), more than 2,000 ESTs were generated from two cDNA libraries. In Mycosphaerella graminicola, 704 unigenes were identified from a budding conidia cDNA library (Keon et al. 2000). Much emphasis has been placed on the identification of pathogenicity genes, whereas biocontrol fungi have been poorly studied. Knowledge of the molecular aspects of the interaction between biocontrol fungi and plant pathogens is critical for increasing the resistance of biocontrol fungi and plants through genetic engineering. In this study, a cDNA library was constructed from C. cupreum mycelium; 3,066 ESTs were assembled, and 1,471 unigenes were generated. BlastX analysis showed that 59.4% (874) of the unigenes exhibited strong similarity to genes/ESTs in public databases. As expected, about half of the genes showed high similarity to fungus-specific genes, reflecting the fungal nature of C. cupreum. These included class V chitin synthase (DV546782) and glucan synthase (DV547356). The fungal cell wall is composed mainly of glucan and chitin, and the genes related to fungal cell wall biosynthesis obtained here could offer clues for understanding the mechanism of cell wall biosynthesis and help to develop new methods for controlling fungal plant diseases. Contig building reduced the total number of nucleotides used for further analysis and increased the length and quality of the sequences from 423 to 1,732 nucleotides. Eleven ESTs matched heat shock protein 30, which is associated with cell resistance.
Contigs with significant similarity to fungal genes in GenBank encode proteins with interesting functions, including C-4 sterol methyl oxidase (contig433, 54 copies), aspartic protease (contig422, 16 copies), and β-1,3-exoglucanase (contig47, 14 copies), which may play a role in the antagonism exerted by the biocontrol fungus.

Biocontrol is a complex process that includes the release of hydrolases, the production of antifungal substances, and further penetration into the host mycelium. The strong biodegradation and substrate colonization performance of C. cupreum results from metabolic versatility and a high secretory potential, which lead to the production of diversified sets of hydrolytic enzymes. Similarly, the direct attack of C. cupreum on phytopathogens is based on the secretion of complex cocktails of enzymes involved in the degradation and further penetration of the fungal host cell wall. The results showed that the EST approach is a means of discovering genes that may be involved in the biocontrol mechanism. Many studies have shown that chitinase (Bruce et al. 1995; Rachel and Ilan 1998) and β-1,3-exoglucanase (Chiu and Tzean 1995) are classical enzymes connected with fungal biocontrol. ESTs obtained from the cDNA library of C. cupreum had high similarity to endoglucanase IV of Hypocrea jecorina (3e-52). Chitinase, which catalyzes the hydrolysis of chitin, an unbranched polymer of β-1,4-N-acetylglucosamine, can decompose the pathogen cell wall. In particular, it can destroy newly formed chitin at the mycelial tip to inhibit pathogen growth, thereby generating oligosaccharides containing GlcNAc that may act as elicitors inducing a general antifungal response of the biocontrol fungus (Susanne et al. 1999). Moreover, chitinases play an important physiological and ecological role in ecosystems as recyclers of chitin, generating carbon and nitrogen sources. Twenty-two ESTs were similar to β-1,3-exoglucanase from Neurospora crassa (3e-44), which can break down the β-1,3-glucan of the pathogen cell wall; experiments have shown that it can markedly inhibit fungal mycelial growth. More importantly, the oligosaccharides released from the pathogen cell wall can also induce resistance of the host plant to pathogens.
Because β-1,3-glucan and a lipid layer lie outside the chitin at the fungal mycelial tip, chitinase and β-1,3-glucanase acting together have a stronger inhibitory effect than either enzyme alone. β-Glycosidase is also strongly antifungal against a wide range of plant pathogens by decomposing the fungal cell wall, and it is synergistic with other cell wall degrading enzymes (Giuliano et al. 2001). Proteases can take part in the breakdown of the host cell wall or act as proteolytic inactivators of pathogen enzymes (Kapteyn et al. 1996; Elad and Kapat 1999). Proteases with biocontrol function include aspartic proteinase, subtilisin (María et al. 2004; Yang et al. 2005), and trypsin-like proteases. Elicitation of plant defense responses by an 18-kDa protein from T. virens with similarity to serine proteases has recently been described (Hanson and Howell 2004). An aspartic proteinase related to mycoparasitic and plant root colonization activities is also expressed in T. asperellum T-203 (Ada et al. 2004). The induction of the aspartic proteinase of T. harzianum by fungal cell walls suggested that it might participate in the early stages of the mycoparasitic process (Suarez et al. 2005). The aspartic protease gene obtained from C. cupreum provides a basis for further study of its function during biocontrol. Polyketide synthase is involved in the biosynthesis of diverse carbon skeletons from simple activated carboxylic acid units. The products of these complex pathways possess a wide range of pharmaceutical properties, including antibiotic, antifungal, and immunosuppressive activities. For example, polyketide synthase plays a key role in cercosporin biosynthesis by Cercospora nicotianae and in biosynthesis of the antibiotic erythromycin by Saccharopolyspora erythraea. Three ESTs encoding isopenicillin N synthase and related dioxygenases, which are key enzymes of penicillin and cephalosporin biosynthesis, were obtained. However, whether C. cupreum can produce cephalosporins remains to be determined.
Some biocontrol fungi can improve the growth of the plant host and induce plant resistance to a variety of plant pathogens. Xylanase can induce plant resistance by opening K+, H+, and Ca2+ ion channels and by triggering the synthesis of PR proteins and ethylene (Dean et al. 1989). DHA14-like major facilitator is a member of a novel major facilitator superfamily from Botrytis cinerea and provides tolerance to toxic compounds, such as camptothecin and cercosporin, as well as certain fungicides (Fleissner et al. 2002); two ESTs of C. cupreum encoding DHA14-like major facilitator genes were found. β-1,3-Glucan binding protein is involved in the recognition of invading microorganisms: it binds specifically to β-1,3-glucan and lipoteichoic acid and causes aggregation of the invading microorganisms, thereby protecting the cells of the organism.

This is the first report of the use of high-throughput EST analysis to examine gene expression in C. cupreum mycelium. The EST collection and its annotation provide a significant resource for fundamental and applied research on C. cupreum. Abundantly expressed clones can be identified, and their expression patterns and physiological significance can be examined in more detail. In particular, the EST approach identified 142 biocontrol-associated EST sequences. EST analysis is a fruitful approach for identifying genes expressed during mycelial development in C. cupreum and for investigating genes with biocontrol function, in order to further elucidate the integrated mechanism against fungal pathogens at the molecular level. The biocontrol genes are a valuable resource for further research. In the future, on the one hand, a mycelial cDNA library of C. cupreum will be constructed using the cell walls of fungal plant pathogens or chitin as inducers. On the other hand, the biocontrol-associated EST sequences will be cloned for transformation into biocontrol fungi or plants for further applications in agriculture.