Introduction

Sugarcane is an important industrial crop of tropical and subtropical regions. It belongs to the genus Saccharum L., of the tribe Andropogoneae in the family Poaceae. Cultivated sugarcane varieties have complex polyploid and aneuploid genome structures that are derived from interspecific hybridization. Sugarcane cultivars of the pre-hybrid era were clones of Saccharum officinarum (noble cane, 2n = 80, octoploid), a species having thick stalks with high sugar content. Early sugarcane breeders made a breakthrough in increasing production and disease resistance by crossing S. officinarum with Saccharum spontaneum (2n = 40–128), a low-sugar wild species.

In spite of its immense economic importance, sugarcane genetics has received relatively little attention compared with other crops, mainly because of its highly heterozygous, polyploid, and frequently aneuploid nature, complex genome, poor fertility, and long breeding/selection cycle. However, the sugarcane genome is beginning to be unraveled by genetic mapping using molecular markers (Hoarau et al. 2001; Raboin et al. 2006) as well as by comparisons with closely related diploid genomes such as sorghum (Dufour et al. 1997; Ming et al. 1998). Agronomically important traits, like disease resistance (Daugrois et al. 1996; Rossi et al. 2003) and sucrose yield (Ming et al. 2001), have been studied in sugarcane, but the complexity of the genome has impeded identification of candidate genes. A map-based cloning strategy is also being used for the cloning of a fungal resistance gene (D’Hont and Glaszmann 2001). Only a limited number of genes related to abiotic and biotic stresses have been reported so far in sugarcane. Analysis of expressed sequence tag (EST) libraries is proving fruitful for identifying and studying most of an organism’s genes/gene families and their alleles in different genotypes. Sequencing and analysis of ESTs have been an efficient approach for identifying a large number of genes expressed during different developmental stages or in response to a variety of environmental conditions in plants such as apple (Newcomb et al. 2006), cocoa (Argout et al. 2008), corn (Alexandrov et al. 2009), coffee (Lin et al. 2005), pepper (Kim et al. 2008), sugarcane (Vettore et al. 2003; Ma et al. 2004), soybean (Umezawa et al. 2008), and spruce (Ralph et al. 2008). This approach has also served as a resource for functional genomics of abiotic stress in plants (Rensink et al. 2005; Houde et al. 2006). The availability of sequences and analysis tools facilitates broad-based utilization of ESTs for gene structure annotation and comparative genomics (Ma et al. 2004).
Through the SUCEST project in Brazil, a collection of ~240,000 ESTs, generated from 26 tissue and developmental stage-specific cDNA libraries, is available in sugarcane (Vettore et al. 2001, 2003). These sugarcane ESTs have aided in the identification and transcription profiling of signal transduction-related genes in sugarcane tissues (Papini-Terzi et al. 2005), molecular chaperone-related sequences in the sugarcane expressome (Borges et al. 2007), low-temperature-responsive genes (Nogueira et al. 2003), and ABA- and MeJA-activated sugarcane bZIP genes (Schlögl et al. 2008).

Sugarcane productivity is adversely affected by abiotic stresses such as drought and by biotic stresses such as red-rot disease. The present work further enriches the sugarcane EST data by generating new ESTs from Indian sugarcane cultivars, including ESTs generated from the red-rot-infected stem of sugarcane cultivar Co 1148.

Materials and methods

Plant material

A well-adapted popular sugarcane variety of subtropical India, CoS 767, was used for making the general cDNA libraries from root, stem (stalk), leaf whorl, and mature leaf tissue of field-grown juvenile or mature sugarcane plants. Similarly, stem (stalk) tissue samples from uninoculated plants and from plants 3 days after inoculation with Colletotrichum falcatum (red rot) pathotype Cf 01, in another popular sugarcane variety of subtropical India, Co 1148 (now susceptible to red rot), were used for making the subtractive cDNA library. To analyze the expression of red-rot-related genes, 3-day post-C. falcatum Cf 01-inoculated and uninfected sugarcane stems of Co 1148 (susceptible variety) and BO 91 (tolerant variety) were used. Inoculation of C. falcatum Cf 01 was carried out using the “plug method” (Duttamajumder and Dwivedi 1995) and the “parafilm method” (Duttamajumder and Mishra 2004). For expression analysis of water-deficit stress-related genes, sugarcane (CoS 767) leaves were harvested from field-grown plants and treated with 400 mM d-mannitol solution for 3, 6, and 9 h before RNA isolation. Sterile Milli-Q water was used to treat the control leaves. The dehydration stress was applied under continuous light in a culture room at standard conditions (25 ± 1°C under a 16 h/8 h light/dark cycle), with illumination at 50–100 μmol quanta m−2 s−1.

RNA extraction and mRNA purification

Total RNA was isolated from various tissues using the acid guanidinium thiocyanate–phenol–chloroform extraction method described by Chomczynski and Sacchi (1987), followed by treatment with RNase-free DNase I (Roche Molecular Biochemicals, Germany). Messenger RNA was purified from total RNA using the PolyATtract mRNA Isolation System III (Promega Corporation, Madison, WI, USA).

Construction of cDNA libraries and DNA sequencing

To construct the general cDNA libraries of root, leaf whorl, and stem, the Creator SMART cDNA Library Construction Kit (BD Biosciences Clontech, USA) was used. Purified poly A+ RNA was taken as the starting material for synthesizing cDNA by the primer extension and LD-PCR methods. The general library of mature leaf tissue was constructed using the Lambda ZAP Express cDNA Synthesis Kit and the ZAP Express cDNA Gigapack III Gold Cloning Kit (Stratagene, USA). The subtracted cDNA library from 3-day post-C. falcatum-inoculated sugarcane stem was constructed using the Clontech PCR-Select cDNA Subtraction Kit (BD Biosciences Clontech, USA). Colonies from all the libraries were picked, arrayed in 96-well microtiter plates, and stocked at −80°C. Plasmids for sequencing were isolated from the bacterial cultures using the QIAprep Turbo Miniprep Kit (QIAGEN, Germany) on a Genesis Workstation 2000 (TECAN, Switzerland) robotic system. Sequencing reactions were performed with 200 ng of plasmid DNA using the ABI Prism BigDye Terminator Sequencing Kit (Applied Biosystems, USA) and analyzed on an ABI3700 automated DNA analyzer (Applied Biosystems, USA).

Sequence processing and clustering

Primary processing of the in-house-generated EST sequences, comprising basecalling and quality trimming, was carried out with the basecaller PHRED (Ewing et al. 1998). Vector/adapter sequences were screened with the Cross_match software (www.phrap.org). EST sequences having a PHRED score >20 and longer than 299 bp were submitted to GenBank. However, no size exclusion was applied for the purpose of clustering. Clustering was carried out with the CAP3 program (Huang and Madan 1999) set with default parameters.
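The submission filter described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study: it assumes the PHRED criterion refers to the average per-base score of a trimmed read (as the Results section suggests), and the function names, read identifiers, and quality values are hypothetical.

```python
from statistics import mean

MIN_MEAN_PHRED = 20   # text: PHRED score > 20 (assumed here to be the read average)
MIN_LENGTH_BP = 300   # text: "longer than 299 bp"


def passes_submission_filter(quals):
    """Return True if one trimmed read meets both submission criteria.

    quals: list of per-base PHRED scores for the read (hypothetical input format).
    """
    if len(quals) < MIN_LENGTH_BP:
        return False
    return mean(quals) > MIN_MEAN_PHRED


def select_for_submission(reads):
    """Return the IDs of reads that pass; `reads` maps read ID -> quality list."""
    return [rid for rid, quals in reads.items() if passes_submission_filter(quals)]


# Toy data (not from the study): one good read, one too short, one low quality.
reads = {
    "est_a": [30] * 350,
    "est_b": [30] * 250,
    "est_c": [15] * 400,
}
print(select_for_submission(reads))  # ['est_a']
```

Because clustering was done without size exclusion, a filter of this kind would apply only to the GenBank submission set, not to the input of CAP3.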

Analysis of EST clusters and their expression

BLASTX analysis was carried out with EST clusters against the “nr” protein database at NCBI to identify any cluster similar to known stress- or disease-related genes. Real-time PCR was carried out on the ABI Prism 7000 Sequence Detection System with its related software (PE Applied Biosystems, USA). Primers for qRT-PCR analysis were designed using Primer Express 2.0 (PE Applied Biosystems, USA). First-strand synthesis was carried out with 5 μg of total RNA using the High Capacity cDNA Archive kit (Applied Biosystems, USA). At least two different RNA preparations and cDNA syntheses were used, and each cDNA sample was subjected to real-time PCR in triplicate. The relative mRNA levels were computed with respect to the internal standard, the actin gene of sugarcane, to normalize for variance in the quality and amount of cDNA. The values presented here are the mean of the two biological replicates, each with three technical replicates. The error bars indicate the standard error of the mean.
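The normalization and replicate structure described above can be illustrated with a short Python sketch. The text does not name the quantification formula, so the widely used 2^(−ΔΔCt) calculation is assumed here purely for illustration; the function names and all Ct values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fold_change(target_treated, actin_treated, target_control, actin_control):
    """Relative expression for one biological replicate.

    Each argument is a list of Ct values from the technical triplicate.
    Assumes the 2^(-ddCt) calculation, which the source does not specify.
    """
    d_treated = mean(target_treated) - mean(actin_treated)   # dCt, stressed sample
    d_control = mean(target_control) - mean(actin_control)   # dCt, mock-treated sample
    return 2 ** -(d_treated - d_control)


def summarize(folds):
    """Mean and standard error of the mean across biological replicates."""
    se = stdev(folds) / sqrt(len(folds)) if len(folds) > 1 else 0.0
    return mean(folds), se


# Hypothetical Ct values for two biological replicates, three technical replicates each.
rep1 = fold_change([21.5, 22.0, 22.5], [18.0, 18.0, 18.0],
                   [24.0, 24.0, 24.0], [18.0, 18.0, 18.0])
rep2 = fold_change([23.0, 23.0, 23.0], [18.0, 18.0, 18.0],
                   [24.5, 24.0, 23.5], [18.0, 18.0, 18.0])
mean_fold, se_fold = summarize([rep1, rep2])
print(rep1, rep2, mean_fold, se_fold)
```

With only two biological replicates, the standard error is a rough indicator of spread rather than a robust estimate.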

Results and discussion

Sequencing, clustering, and annotation of sugarcane ESTs

A total of 34,457 clones were sequenced from the 5′ end to generate EST sequence data using various tissue-specific cDNA libraries (root, stem, and leaf tissue) of an Indian subtropical variety of sugarcane, CoS 767. The basecaller PHRED was used to extract good-quality sequences with an average basecalling score of more than 20. In all, 25,382 (74%) sequences were found to be of good quality (size > 299 bp) and were submitted to GenBank. To remove redundancy, all the sequenced ESTs (34,457) were assembled using the CAP3 program. To generate a non-redundant dataset of sugarcane ESTs, the entire sugarcane EST collection, comprising 255,716 sequences publicly available at GenBank, was downloaded, clustered using the CAP3 software, and then combined with the in-house-generated data. Finally, a non-redundant dataset for sugarcane ESTs was generated, which consisted of 61,966 unigenes (2,861 contigs and 59,105 singlets). Out of these, 4,002 are being reported for the first time.
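As a quick consistency check on the counts reported above, the following Python snippet recomputes the good-quality fraction and the unigene total. The variable names are illustrative; the numbers are those given in the text.

```python
# Counts reported in the text
sequenced = 34_457      # clones sequenced from CoS 767 libraries
good_quality = 25_382   # sequences passing the quality and length filter
contigs = 2_861
singlets = 59_105

good_fraction = good_quality / sequenced
unigenes = contigs + singlets

print(f"{good_fraction:.0%}")  # 74%
print(unigenes)                # 61966
```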

To identify red-rot-related genes and further enrich the sugarcane ESTs, a subtractive cDNA library was also prepared from the stem of sugarcane variety Co 1148 challenged with C. falcatum. The focus on red-rot disease reflects its importance: no variety in India is released unless it possesses resistance against the disease. In other words, sugarcane breeding is geared around red-rot resistance (Chona 1980; Satyavir 2003; Duttamajumder 2008). A total of 1,440 clones were randomly sequenced to generate ESTs, of which 1,069 good-quality sequences (average PHRED score > 20 and more than 299 bp in length) were submitted to GenBank. To identify EST clusters unique to the biotic stress (red rot of sugarcane), these sequences were clustered separately. As a result, 102 clusters were generated by assembling all the 1,440 EST sequences. Comparison of these clusters with the 61,966 previously generated EST clusters revealed that only 17 red-rot cDNA library clusters were >95% similar to the main dataset, and only one cluster was 100% identical. Thus, at least 85 (83%) red-rot-specific clusters were unique and not reported in the database previously.
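The comparison described above amounts to splitting clusters by a similarity cutoff. A minimal Python sketch follows, assuming each red-rot cluster has been assigned its best percent identity against the main dataset; the function name, cluster IDs, and identity values are hypothetical, and only the 95% cutoff and the counts 102 and 17 come from the text.

```python
SIMILARITY_CUTOFF = 95.0  # percent; the text uses ">95% similar"


def split_by_novelty(best_identity):
    """Split clusters into (known, novel) lists.

    best_identity maps cluster ID -> best percent identity against the main
    dataset, or None when no match was found (hypothetical input format).
    """
    known, novel = [], []
    for cluster_id, identity in best_identity.items():
        if identity is not None and identity > SIMILARITY_CUTOFF:
            known.append(cluster_id)
        else:
            novel.append(cluster_id)
    return known, novel


# Toy example (values are illustrative, not from the study).
hits = {"rr_1": 100.0, "rr_2": 96.2, "rr_3": 88.0, "rr_4": None}
known, novel = split_by_novelty(hits)
print(known, novel)  # ['rr_1', 'rr_2'] ['rr_3', 'rr_4']

# Arithmetic behind the reported figure: 102 clusters, 17 above the cutoff.
total, matched = 102, 17
unique = total - matched
print(unique, f"{unique / total:.0%}")  # 85 83%
```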

To assign putative functions to the non-redundant sugarcane EST clusters and to identify the abiotic/biotic stress-related candidates, the dataset was compared with the UniProt Knowledgebase (UniProtKB) as well as the nr protein database of NCBI using BLASTX. A similar analysis with the EST clusters from the red-rot-challenged tissue cDNA library led to putative functional assignments for 58 (57%) of these clusters (Supplementary Table 1). These included sequences from a range of functional categories, such as transcription, basic metabolism, membrane transport, and signal transduction, as well as a few hypothetical proteins. Among the signal transduction components were calmodulin-binding proteins, receptor protein kinases, and auxin-responsive genes. Several clusters showed significant homology to pathogen-induced proteins like peroxidases, remorin-like proteins, and endoplasmin homologs.

Expression of water-deficit stress-/red-rot-related genes

Based on the information generated by functional annotation of EST clusters, efforts were made to identify clusters that may play a critical role during biotic/abiotic stress. ESTs are an invaluable resource for identifying genes that respond to various stresses (Chen et al. 2008; Gao et al. 2008) and those conferring disease resistance (Shi and Wang 2008). On the basis of functional annotation, 37 water-deficit stress-related and 22 red-rot-related clusters (15 from the general cDNA libraries and seven from the red-rot-subtracted cDNA library) were selected for expression analysis (Table 1).

Table 1 Sugarcane clusters showing differential expression by real-time RT-PCR analysis

Out of the selected 37, 25 water-deficit stress-related clusters showed greater than 2-fold relative expression during dehydration stress (Fig. 1a). For most of the water-deficit stress-related clusters, significantly higher expression levels were observed at 9 h of dehydration. All clusters (stress clusters 8, 9, 17, 19, and 22) that showed homology to osmotic stress- or dehydration stress-associated proteins were significantly up-regulated. A very high (~13-fold) induction at 9 h of dehydration stress was noted in the case of stress cluster 13 (a salt-stress-inducible bZIP protein; Fig. 1a). These might be delayed-action genes working in the related pathways. Some of the clusters, like stress cluster 1 (putative oxidative-stress responsive) and stress cluster 15 (70 kDa heat shock protein), showed up-regulation during 3 to 6 h of dehydration stress, but their expression decreased later (Fig. 1a). These genes might have some role in providing tolerance in the initial stages of stress. Some clusters (stress clusters 2, 11, 12, 18, etc.) showed higher expression at 3 h, a decline at 6 h, and significantly higher expression again at the later stage (9 h; Fig. 1a). This pattern of expression and the delayed response of certain genes can be attributed to their tight regulation via other candidates involved in the pathway. ABA-dependent and ABA-independent pathways are involved in stress signal transduction. Drought stress increases the ABA level in plants, which in turn is involved in the signal transduction of gene expression conferring adaptation (Shinozaki and Yamaguchi-Shinozaki 1997). Plants also use varying levels of osmoprotectants (betaines, polyols, sugars, and amino acids such as proline) during stress to improve their tolerance to drought, salinity, and other stresses (McNeil et al. 1999). Zhang et al. (2006) showed that transgenic plants transformed with the Grifola frondosa trehalose synthase (TSase) gene can accumulate high levels of trehalose and have enhanced tolerance to drought. Some of the sugarcane genes studied here might be involved in the pathways that lead to the production of such osmoprotectants. Molecular chaperones or heat shock proteins are also responsive to various environmental stresses. Over-expression of heat shock protein genes contributes to drought-stress tolerance in plants (Sun et al. 2001; Cho and Hong 2006). A dehydrin gene, OsDhn1, in rice has been shown to be strongly induced by drought stress (Lee et al. 2005). Certain heat shock protein and dehydrin genes were found to be significantly induced by drought stress in sugarcane, as evident from the results.
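The three temporal patterns described above (late induction, early induction followed by decline, and a dip at 6 h between two peaks) can be expressed as a simple rule-based classifier. The Python sketch below is an illustration only: the decision rules and labels are assumptions rather than the authors' procedure, and the example fold changes are hypothetical apart from the 2-fold cutoff mentioned in the text.

```python
FOLD_CUTOFF = 2.0  # text: "greater than 2-fold relative expression"


def classify_profile(f3, f6, f9):
    """Label a time course from fold changes at 3, 6, and 9 h of dehydration.

    The rules are illustrative assumptions, not the criteria used in the study.
    """
    if max(f3, f6, f9) <= FOLD_CUTOFF:
        return "not induced"
    # Induced at 3 h, dip at 6 h, induced again at 9 h
    if f3 > f6 < f9 and f3 > FOLD_CUTOFF and f9 > FOLD_CUTOFF:
        return "biphasic"
    # Highest expression at the last time point
    if f9 >= f3 and f9 >= f6:
        return "late"
    # Otherwise expression peaks at 3 or 6 h and declines by 9 h
    return "early"


# Hypothetical profiles for illustration.
print(classify_profile(1.2, 3.0, 13.0))  # late
print(classify_profile(3.5, 4.0, 1.5))   # early
print(classify_profile(4.0, 1.5, 5.0))   # biphasic
print(classify_profile(1.1, 0.9, 1.3))   # not induced
```

A fixed cutoff ignores measurement error, so any real classification would also need to take the replicate variation into account.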

Fig. 1 a Kinetics of induction of stress-related sugarcane EST sequences (genes) in response to water-deficit stress. The transcript levels of each stress-related gene in mature leaf treated with 400 mM d-mannitol for 3, 6, and 9 h were plotted as the relative expression (fold) over leaves mock-treated for the same duration. b, c Expression profiles of various red-rot-related genes in response to Colletotrichum falcatum Cf 01 (red-rot) challenge. The transcript levels of each gene in 3-day post-inoculated sugarcane stem (mature) of the susceptible (Co 1148) and resistant (BO 91) varieties were plotted as the relative expression (fold) over the mock-treated stem for the same duration. Sugarcane genes that were not expressed in the resistant variety (b) or were up-regulated in the resistant variety (c) are shown. Stress clusters 41–47 are derived from the subtractive cDNA library from red-rot-infected sugarcane stem.

Plants and animals respond to various viral, bacterial, and fungal pathogenic agents via disease-resistance pathways operating in them. The majority of the disease-resistance genes (also known as R genes) in plants encode nucleotide binding site leucine-rich repeat (NBS-LRR) proteins (McDowell and Woffenden 2003; Belkhadir et al. 2004; Leister 2004). In the present study, the expression of 22 disease-related genes in susceptible and resistant varieties of sugarcane during red-rot challenge was monitored and compared. Three clusters from the general cDNA libraries, stress cluster 26, stress cluster 32, and stress cluster 36 (which showed similarity to an NBS-LRR type disease-resistance protein, a putative disease-resistance RPM1 protein, and a stem rust resistance protein, respectively), were found to be down-regulated in the susceptible variety, while their expression was absent in the resistant variety (Fig. 1b). They probably act as repressors or negative regulators controlling the expression of certain other genes in the pathway associated with disease resistance. For example, ERD15 (early responsive to dehydration 15) is a negative regulator of abscisic acid responses in Arabidopsis, and silencing of ERD15 improved tolerance to drought and freezing as well as disease resistance (Kariola et al. 2006). Expression of some of the sugarcane clusters/ESTs was found to be slightly up-regulated in the resistant variety (BO 91), while the same clusters were down-regulated in the susceptible variety (Co 1148), with stress clusters 44, 45, and 47 showing substantial differences between the two genotypes (Fig. 1c). Most of the disease-related genes analyzed here showed down-regulation in the resistant variety, the more pronounced ones being stress clusters 31, 33, 35, and 38 (Supplementary Figure 1). The up-regulation of some sugarcane ESTs in the resistant variety suggests that they act as positive regulators in the defense response.
Entire sets of candidate disease-resistance (R) genes are now being identified and studied at the genome-wide level (Zhou et al. 2004; The Rice Chromosomes 11 and 12 Sequencing Consortia 2005; Yang et al. 2006). An EST cluster from the general cDNA library, stress cluster 38 (a disease-resistance-related protein), showed induction (>1.5-fold) in the susceptible sugarcane variety, which makes it a good disease-associated candidate for further study. Plant disease-resistance genes identified in our EST pool that showed differential regulation can provide preferred targets for breeding or for engineering durable disease resistance in sugarcane. Analyses of these gene functions would help to better understand the physiological mechanisms underlying the response of the sugarcane plant to various biotic and abiotic stresses.