Introduction

Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis L.) are oilseed plant species that belong to the family Euphorbiaceae. Both are economically important as sources of oil, which is used for the production of various industrial products such as lubricants, cosmetics and medicines, as well as high-quality biodiesel, owing to the presence of a large proportion of unsaturated fatty acids. Of late, the large-scale cultivation of selected genotypes of both species has made them vulnerable to biotic stresses, including diseases and pests, thereby affecting their oil yield potential [1, 2]. In the recent past, Jatropha curcas mosaic disease (JcMD) has been found to be prevalent in plantations and is continuously reducing the fruit yield and quality of Jatropha plants in the field [3–5]. Similarly, fungal strains of Alternaria alternata, Neoscytalidium dimidiatum, Botryosphaeria dothidea and Colletotrichum gloeosporioides were reported to be responsible for infectious spots, root rot, black rot and anthracnose disease, respectively, causing a reduction in the overall yield of Jatropha [6–9]. Viruses are also prevalent in Castor bean and reduce overall yield and quality [10–12]. Other pathogens such as fungi and bacteria are also responsible for causing diseases in Castor bean. In Tanzania, a fungal complex including Alternaria and Fusarium species was reported to be responsible for infection of the inflorescence and capsules of Castor bean [13]. Fungal diseases (Verticillium spp.) and bacterial diseases (wilt) are prevalent in Castor bean [14–16]. Management of various biotic stresses in Jatropha and Castor bean through pesticides is not economically viable. Therefore, selection and development of disease-resistant genotypes would be a sustainable strategy. However, no systematic breeding programme for disease resistance has been initiated to date in either Jatropha or Castor bean.

Plants have acquired resistance to many pathogens and pests through disease resistance (R) genes, which encode proteins that protect them from pathogenic organisms [17]. Recent research on R-genes and their downstream signal transduction mechanisms has provided a strong base, paving the way for their use in disease control [18, 19]. The bulk of R-genes in plants belong to the nucleotide binding site-leucine rich repeat (NBS-LRR) class, which provides resistance to a large number of pathogens including parasites, fungi, bacteria, oomycetes, insects and viruses [20–24]. NBS proteins are further classified into two subcategories based primarily on domains and motifs. Those having an N-terminal domain resembling the Toll and interleukin-1 receptors are designated TIR proteins, and those without a TIR domain are categorized as non-TIR proteins [25]. A few non-TIR proteins contain an N-terminal coiled-coil (CC) domain that may be involved in signaling and protein interactions [24, 26]. The NBS domains of both TIR and non-TIR proteins consist of a P-loop (Kinase-1), Kinase-2, Kinase-3 and some additional short motifs of unknown role [27]. The NBS domain functions by binding ATP [28], and the C-terminal leucine rich repeat (LRR) is implicated in pathogen binding and regulation of signal transduction [24, 25]. TIR domains are also involved in resistance specificity determination and signaling [25, 29].

All angiosperms have NBS-LRR-encoding genes, with differences between monocots and dicots. Although many NBS-encoding genes that code for TIR domains have been identified in A. thaliana [30], this subclass remains missing in cereal species [31, 32]. This finding suggests that, since the divergence of monocots and dicots >200 million years ago [33], the association of the TIR domain with NBS-encoding genes was conserved in dicots but lost in monocot species.

The regulation of immunity and of the response to other stresses in plants in their natural habitat is enforced by a network of regulatory proteins or transcription factors, which are considered potential targets for engineering plant defense [34]. Transcription factors normally bind to the promoters of resistance genes and thus regulate their expression [35]. Many defense or disease resistance related transcription factors have been studied recently, including the TGA family of basic domain-leucine zipper (bZIP) proteins [36, 37], the MYB proteins [38], the ethylene responsive element binding factors (ERFs, whose DNA binding domain is also found in the APETALA2 (AP2) protein family), the WRKYs [39] and the Whirly family [34].

Identification and characterization of disease resistance genes, including NBS-LRRs, is anticipated to accelerate genetic improvement programmes and breeding for the development of disease-resistant varieties [40]. In the recent past, many resistance genes, including NBS-LRR genes, have been employed to produce transgenic disease-resistant varieties. In tobacco, the N gene encoding a TIR-NBS-LRR protein was transferred to develop transgenic lines that showed resistance to the mosaic virus [41]. Similarly, transgenic tobacco lines were developed using the common bean TIR-NBS-LRR gene RT4-4, and these exhibited resistance towards mosaic virus from tomato or pepper [42]. In tomato, the Bs2 gene encoding an NBS-LRR protein has been transferred to develop resistance against bacterial spot disease [43]. Another gene, Xa21, responsible for bacterial blight resistance, was introduced into Chinese rice varieties, and the transgenic plants exhibited resistance to bacterial blight [44]. In wheat, the Pm3b gene has been introgressed and showed resistance against powdery mildew [45]. In another important study, RPS4 and RRS1, two NBS-LRR type R genes, conferred immunity against various bacterial and fungal pathogens in members of Brassicaceae and Solanaceae [46]. All these studies suggest that NBS-LRR genes hold potential for the development of disease-resistant transgenics. To date, there is no information on the identification of disease resistance genes (NBS-LRR) or transcription factors regulating defense response in Jatropha and Castor bean. However, recent sequencing of their genomes and the availability of transcriptomes [47, 48] have opened up avenues for detailed analysis of disease resistance genes, especially NBS-LRR genes, and of the transcription factors regulating defense response.

A genome-wide investigation of NBS-LRR resistance genes and transcription factors in a plant genome can therefore provide novel insights into the overall resistance architecture. The number of NBS-LRR genes varies among plant species, irrespective of genome size. For example, the Arabidopsis genome (125 Mb) contains 149 genes [49], the rice genome (420 Mb) contains 535 genes [32], the potato genome (840 Mb) has 438 NBS-LRR genes [50], the soybean genome (1,115 Mb) comprises 319 NBS-LRR genes [51], the Populus genome (500 Mb) has about 400 NBS-LRR genes [52] and the cucumber genome (350 Mb) contains 57 NBS-LRR genes [53].

Whole-genome analysis of defense-related transcription factors has been done in many plant species. In the Chinese cabbage genome, 291 putative AP2/ERF transcription factors regulating resistance against disease and biotic stresses were identified [54]. In the model plant Arabidopsis, 118 transcription factors responding to the defense elicitor chitin were identified using the Affymetrix Arabidopsis whole-genome array; these belonged to the APETALA2/ethylene responsive element binding protein, MYB domain-containing protein, C2H2 zinc finger protein and WRKY domain families [55]. In soybean, an important crop species, biotic stress response related trihelix-GT and bHLH transcription factors were identified and characterised using an in silico approach [56]. No transcription factors specific to defense response have been identified in Jatropha and Castor bean to date. Comparative studies have suggested that plant genomes encode hundreds of NBS-LRR genes, but that there is vast diversity in the total number and distribution of NBS-LRR genes and subclasses [23]. Guo et al. [23] compared the NBS-LRR gene complement of Arabidopsis thaliana and its relative Arabidopsis lyrata and found that both species contain similar numbers of NBS-LRR genes. Plocik et al. [57] compared the NBS domain sequences of NBS-LRR resistance genes from Helianthus annuus (sunflower), Cichorium intybus (chicory) and Lactuca sativa (lettuce), suggesting that Asteraceae species have different R-gene families, comprising both toll-interleukin-receptor homology (TIR) and coiled-coil (CC) domain containing NBS-LRR resistance genes. Between the two closely related species, chicory and lettuce, the composition of the CC subfamily was similar, while sunflower showed less resemblance in structure.
A genome-wide comparative analysis of NBS-LRR genes in Sorghum bicolor and Oryza sativa revealed species-specific expansion of NBS-LRR genes that may directly explain variations in disease susceptibility of the corresponding species [58]. In another recent genome-wide comparative study, NBS-encoding genes were analysed in B. oleracea, B. rapa and A. thaliana; 157, 206 and 167 NBS-encoding genes were identified, respectively, providing a deeper understanding of the evolutionary history of NBS-encoding genes after the divergence of A. thaliana and the Brassica lineage [59].

Jatropha and Castor bean belong to the same family, Euphorbiaceae. However, a detailed analysis of NBS-LRR genes with respect to their domain architecture, expression, comparative number, genome organization and phylogenetic relationship, and of the status of transcription factors regulating resistance genes, is completely lacking to date. Therefore, the whole genomes and transcriptomes of both species were analyzed and compared to identify the whole complement of NBS-LRR genes, their genome location, their classification into Toll/interleukin-1 receptor NBS-LRRs (TNLs) or coiled-coil NBS-LRRs (CNLs), and the transcription factors specific to defense response or disease resistance.

Materials and methods

Data collection

The transcriptome data of Jatropha and Castor bean were downloaded from Sequence Read Archive (SRA) module of NCBI with accession nos. SRR087417 and ERA047687 respectively. The whole genomes of Jatropha and Castor bean were downloaded (ftp://ftp.kazusa.or.jp/pub/jatropha/;http://castorbean.jcvi.org/downloads.php). Velvet software [60] was downloaded from (http://www.ebi.ac.uk/~zerbino/velvet/) for assembly of transcriptome SRA files (NGS data). For similarity search, all the available NBS-LRR mRNA sequences were downloaded from the GenBank module of NCBI. Perl program, pfam_scan.pl and Pfam library of hidden Markov models (HMMs) of protein families were retrieved from Pfam website (http://pfam.janelia.org/) for domains prediction in protein sequences translated from transcripts.

Identification of Pfam domains/families associated with NBS-LRR genes and transcription factors related to disease resistance

The domains/families associated with NBS region were considered in the study due to the conserved nature of NBS region. Pfam keyword search with ‘NBS-LRR’ and associated key words (Table 1) was performed in the Pfam database (Supplementary Fig. 1). All hits of domains/families from the keywords were manually checked for their role in plant defense response and included in ‘Master list 1’ (NBS-LRR). Transcription factors important in disease resistance were retrieved from the literature. It was found that 10 different transcription factor families were involved in disease resistance (Table 2). All 10 transcription factor (TF) families were searched in Pfam text search to find out domains associated with each family. All hits of domains/families from the keyword were checked for their role as transcription factors for plant defense response. Only those domains were incorporated in the ‘Master list 2 (transcription factors)’ which had significant functional role as TF in plant defense response.

Table 1 NBS-LRR domains and their respective Pfam Ids
Table 2 Transcription factors involved in plant disease resistance

Identification of NBS-LRR genes and defense response associated transcription factors

The NBS-LRR mRNA sequences collected previously from NCBI were mapped on to the transcriptomes of Jatropha and Castor bean using BLAST in order to identify all NBS-LRR containing transcripts in the transcriptome other than predicted 91 NBS-LRR genes in Jatropha [48] and 121 in Castor bean [47]. The manually adopted Pfam IDs (domains/families) associated with NBS regions and transcription factors were also mapped using Pfam domain/family search against the transcriptomes of both Jatropha and Castor bean in order to identify all genes and transcription factors having the domain IDs from the NBS regions and families of transcription factors, respectively (Supplementary Fig. 1).

The domain architecture of a protein can be explored by searching its sequence against the Pfam library of HMMs [61]. NBS-LRR genes and the transcription factors were identified according to domain architecture. All transcripts of Jatropha and Castor bean were translated into proteins (using the canonical codon table) according to reading frames, and the proteins were then subjected to Pfam domain/family search to detect the presence of domains. Proteins matching Pfam domains/families listed in 'Master list 1' (see the section on identification of Pfam domains/families) were selected as NBS proteins and the corresponding transcripts as NBS transcripts. Proteins matching Pfam domains/families listed in 'Master list 2' (see the same section) were selected as transcription factors associated with disease resistance. In-house PERL programs were used to translate transcripts into proteins, to predict Pfam domains in the translated proteins and to compare the predicted domains against the Master lists. Finally, the results were cross-checked manually (Table 3).
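The translation step described above can be illustrated with a minimal Python sketch. It is not the in-house PERL program used in the study; it assumes that "according to reading frames" means all six frames (three per strand), and the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard (canonical) genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(transcript):
    """Translate a transcript in three forward and three reverse frames."""
    seq = transcript.upper().replace("U", "T")
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    # Hypothetical toy transcript, for illustration only.
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Each translated frame would then be passed to a Pfam search (for example with pfam_scan.pl), and the reported domains compared with the Master lists.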

Table 3 Comparative distribution of NBS-LRR disease resistance genes between Castor bean and Jatropha genomes

Location of NBS-LRR genes in sequence contigs

All predicted NBS-LRR genes of Jatropha and Castor bean were mapped to genome sequence contigs. BLAST search was used to map contigs on whole genomes with exact matching cut off. Position of contigs on genome was extracted from BLAST alignment output file. All analysis, BLAST search and extraction of contigs location were done through in-house developed PERL programs.

Identification of common and unique NBS-LRR genes and transcription factors in Jatropha and Castor bean genomes

To identify common and unique NBS-LRR genes and defense response related transcription factors between Jatropha and Castor bean, all predicted genes and transcription factors from both the species were used in similarity search. BLASTN was used for finding similarity among contigs of Jatropha and Castor bean with cut off values of equal to or more than 70 % within at least a length of 100 nucleotides (Supplementary Fig. 1). In-house PERL program was used to perform BLASTN and to extract results within mentioned cut off, further results were also cross checked manually.
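As an illustration of the cut-off step, the following Python sketch filters BLASTN hits at the thresholds given above (identity of at least 70 % over an alignment of at least 100 nucleotides) and splits the query set into common and unique genes. It assumes BLAST tabular output (query ID, subject ID, percent identity and alignment length in the first four tab-separated columns); the study's own PERL program and its output format are not described in the text, and the identifiers in the example are hypothetical.

```python
def filter_blast_hits(tabular_text, min_identity=70.0, min_length=100):
    """Keep hits meeting the identity and alignment-length cut-offs."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, length = float(fields[2]), int(fields[3])
        if identity >= min_identity and length >= min_length:
            hits.append((query, subject, identity, length))
    return hits


def common_and_unique(all_queries, hits):
    """Split query IDs into those with a passing hit and those without."""
    common = {query for query, *_ in hits}
    return common, set(all_queries) - common


if __name__ == "__main__":
    # Hypothetical hits, for illustration only.
    example = (
        "jc_gene_1\trc_gene_7\t82.5\t240\n"
        "jc_gene_2\trc_gene_3\t68.0\t300\n"
        "jc_gene_3\trc_gene_9\t91.0\t60\n"
    )
    passing = filter_blast_hits(example)
    print(common_and_unique(["jc_gene_1", "jc_gene_2", "jc_gene_3"], passing))
```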

Expression analysis of identified NBS-LRR genes and transcription factors

Transcript abundance was quantified using RSEM [62]. RSEM is a user-oriented software package for quantifying transcript abundances from RNA-Seq data. It calculates maximum likelihood abundance estimates as well as posterior mean estimates and 95 % credibility intervals for genes/isoforms. Two measures specify the abundance estimates: one is an estimate of the number of fragments derived from an isoform or gene [the expected counts (EC)], and the other is the estimated fraction of transcripts within the sample that is represented by the given isoform or gene.

The expression profiles were obtained through pme_TPM (pme: posterior mean estimate; TPM: transcripts per million) values. TPM is preferred over other metrics such as FPKM (fragments per kilobase of transcript per million mapped reads) [63] and RPKM (reads per kilobase per million) [64] because it does not depend on the mean expressed transcript length and is therefore more comparable among diverse species and samples [62]. The transcript abundance of the contigs from the transcriptomes of Jatropha and Castor bean was calculated using the pme_TPM parameter of the RSEM package. All parameters were kept at their default values.
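The TPM metric can be illustrated with a short Python sketch. It follows the usual definition (each transcript's count is divided by its length, and the resulting rates are rescaled to sum to one million); it does not reproduce RSEM's statistical model, which estimates expected counts and uses effective transcript lengths. The function name and the example values are hypothetical.

```python
def tpm(counts, lengths):
    """Convert per-transcript counts and lengths into TPM values."""
    if len(counts) != len(lengths):
        raise ValueError("counts and lengths must have the same size")
    # Length-normalised rate for each transcript.
    rates = [count / length for count, length in zip(counts, lengths)]
    total = sum(rates)
    if total == 0:
        raise ValueError("at least one transcript must have a non-zero count")
    # Rescale so that the values in one sample sum to one million.
    return [rate / total * 1e6 for rate in rates]


if __name__ == "__main__":
    # Hypothetical counts and transcript lengths (nucleotides).
    values = tpm(counts=[10, 20, 30], lengths=[1000, 1000, 3000])
    print([round(v, 2) for v in values])
```

Because the values always sum to one million within a sample, a given TPM value represents the same relative share of the transcript pool in either transcriptome.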

Identification of CNLs and TNLs in predicted NBS-LRR genes

Using PCOILS (http://toolkit.tuebingen.mpg.de/pcoils), the predicted NBS-LRR genes were further characterized into CNLs and TNLs with default parameters (Supplementary Fig. 1). PCOILS compares a sequence to previously identified parallel two-stranded coiled-coils and determines a similarity score.

Retrieval of disease resistance gene sequences of Castor bean and Jatropha

The NBS-LRR gene sequences of Castor bean were subjected to BLASTN against the Castor bean Database (http://blast.jcvi.org/erblast/index.cgi?project=rca1) and the database available at the NCBI (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Similarly, the gene sequences of Jatropha were used as query against the Jatropha Genome Database where they were subjected to BLASTN (http://www.kazusa.or.jp/jatropha/cgi-bin/blast.cgi). This analysis was performed to predict position of these genes in respective sequence contigs. The protein sequences of both these plant genomes were also subjected to similar analysis using BLASTP.

Protein characterization, motif distribution and domain prediction

By using PCOILS, the predicted disease resistance proteins were characterized into CNLs and TNLs (http://toolkit.tuebingen.mpg.de/pcoils). The distributions of motifs in these proteins were predicted using MAST (http://meme.sdsc.edu/meme/cgi-bin/mast.cgi). Protein function domains of disease resistance genes were predicted using the NCBI Conserved Domain search (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?INPUT_TYPE=live&SEQUENCE) and the HMM search using Pfam (http://pfam.jouy.inra.fr/hmmsearch.shtml).

Results

Identification of NBS-LRR genes and defense response associated transcription factors in transcriptomes of Jatropha and Castor bean

In Jatropha, 45 potential NBS-LRR genes were identified by mapping Pfam domains and 7 by mapping NBS-LRR mRNA sequences with BLAST analysis; 5 of these were identified by both Pfam domain mapping and BLAST analysis (mRNA sequence mapping) (Supplementary file 1). Similarly, in Castor bean, 44 genes were identified by mapping Pfam domains and 13 by mapping publicly available NBS-LRR mRNA sequences with BLAST analysis, of which 10 genes were identified by both Pfam domain matching and BLAST hits (mRNA sequence mapping) (Supplementary file 2). In total, 47 new NBS-LRR genes were thus identified in each of the Jatropha and Castor bean genomes, in addition to the previously identified NBS-LRR genes. The uniqueness of all these newly identified NBS-LRR genes was confirmed through a similarity search against previously reported NBS-LRR genes (Sato et al., 2011; Chan et al., 2010). The similarity search (70 % identity) showed that all identified NBS-LRR genes were new and not reported earlier. Similarly, when Pfam domains specific to the transcription factors involved in disease resistance or defense response were mapped onto the transcriptomes of both species, 122 and 318 transcription factors were identified in Jatropha and Castor bean, respectively.
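The totals of 47 genes per species follow from counting genes found by either method only once. A small Python check of this arithmetic, using only the counts reported above (the function name is hypothetical):

```python
def union_size(n_pfam, n_blast, n_both):
    """Number of distinct genes found by Pfam mapping or BLAST mapping."""
    return n_pfam + n_blast - n_both


if __name__ == "__main__":
    print("Jatropha:", union_size(n_pfam=45, n_blast=7, n_both=5))
    print("Castor bean:", union_size(n_pfam=44, n_blast=13, n_both=10))
```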

Location of NBS-LRR genes in genome sequence contigs

The identified NBS-LRR genes were mapped onto the respective genomes using BLAST search. In both Jatropha and Castor bean, the contigs showing multiple matches were manually curated to cover the entire query length. The query genes mapped onto the respective genome contigs showed identity in the range of 95–100 %. In Jatropha, the NBS-LRR genes were classified into three categories according to location. The first category had 28 genes, each located in a single contig without any disruption in the coding sequence, suggesting that these genes lack introns. The second category had 7 genes, each mapped onto a single contig with one or more gaps indicating an insertion, which may correspond to an intron. The third category comprised 12 genes, each matching more than one genomic contig, which implies that these genes were transcribed from different locations and may have introns (Supplementary file 3). Out of 47 genes, 16 were found in clusters of two, i.e. these 16 genes were present in 8 genomic contigs (Jc476461637, Jc476470256, Jc476481852, Jc476483387, Jc476485273, Jc476487282, Jc476487650 and Jc476489371) (Supplementary file 3). Similarly, in Castor bean, the 47 NBS-LRR genes were classified into three categories according to location on the genome. In the first category there were 32 genes, each located in a single contig without any insertion, meaning that these genes lack introns. The second category comprised 20 genes, each mapped onto a single contig with one or more gaps indicating an insertion, which may correspond to an intronic region. The third category had 5 genes, each matching more than one contig, indicating that these genes were transcribed from different locations and may have introns (Supplementary file 4). Clustered NBS-LRR genes were also observed in Castor bean: 4 genes were located on a single contig (Rc124357167), i.e. were clustered together, and 6 genes were found in clusters of two, i.e. were present in 3 contigs (Rc124350718, Rc124354636 and Rc124357119) (Supplementary file 4).
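The three location categories and the cluster detection can be sketched in Python as follows. This is a simplified illustration, not the in-house PERL program: it assumes that each gene's BLAST alignments have already been reduced to a list of aligned segments, that a single segment means an uninterrupted match, and that several segments on one contig indicate gaps. The type, function names and identifiers are hypothetical.

```python
from collections import defaultdict, namedtuple

# One aligned segment of a gene on a genomic contig (query coordinates).
Segment = namedtuple("Segment", "contig qstart qend")


def classify_gene(segments):
    """Assign a gene to a location category from its aligned segments."""
    if not segments:
        return "unmapped"
    contigs = {segment.contig for segment in segments}
    if len(contigs) > 1:
        return "multiple contigs"
    if len(segments) > 1:
        return "single contig, gapped"
    return "single contig, ungapped"


def find_clusters(gene_segments):
    """Return contigs that carry two or more genes, with their gene IDs."""
    genes_by_contig = defaultdict(set)
    for gene, segments in gene_segments.items():
        for segment in segments:
            genes_by_contig[segment.contig].add(gene)
    return {
        contig: sorted(genes)
        for contig, genes in genes_by_contig.items()
        if len(genes) >= 2
    }


if __name__ == "__main__":
    # Hypothetical genes and contigs, for illustration only.
    example = {
        "gene_a": [Segment("contig_1", 1, 900)],
        "gene_b": [Segment("contig_1", 1, 300), Segment("contig_1", 301, 800)],
        "gene_c": [Segment("contig_2", 1, 400), Segment("contig_3", 401, 700)],
    }
    for gene, segments in example.items():
        print(gene, classify_gene(segments))
    print(find_clusters(example))
```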

Identification of common and unique NBS-LRR genes and transcription factors between Jatropha and Castor bean genomes

The 47 genes identified in each of Jatropha and Castor bean were aligned in order to determine the common and unique NBS-LRR genes between the two species. This analysis was performed using BLASTN with cut-off values of identity >70 % and an alignment length of 100 bp. The genes from Castor bean were taken as the database and the genes from Jatropha as the query. In Jatropha, 7 genes showed identity to Castor bean genes, whereas in Castor bean 8 genes showed identity to Jatropha genes, implying that 7 and 8 NBS-LRR genes are common between Jatropha and Castor bean, respectively (Table 4). Of the 8 Castor bean genes, 6 each showed identity to a different, specific gene from Jatropha, whereas 2 showed similarity to the same Jatropha gene. Further, 40 and 39 genes were uniquely present in the Jatropha and Castor bean genomes, respectively. For the transcription factors (TFs), Castor bean TFs were taken as the database and the TFs from Jatropha as the query. Seventy transcription factors were found to be common to Jatropha and Castor bean (Table 5). Further, 52 and 255 transcription factors were found to be uniquely present in Jatropha and Castor bean, respectively.

Table 4 Common NBS-LRR genes between Jatropha and Castor bean genomes
Table 5 Common disease resistance transcription factors between Jatropha and Castor bean

Transcript abundance of NBS-LRR genes and transcription factors associated with disease resistance

RSEM was used for the transcript abundancy measurements of identified set of NBS-LRR genes and transcription factors associated with the disease resistance mechanism. The expression profiles were obtained through pme_TPM (pme: Posterior mean estimates; TPM: transcripts per million) values using RSEM software package. In RSEM, posterior mean estimate (pme) is computed for each gene and isoform abundance, with a maximum likelihood (ML) estimate [62]. pme_TPM for both Jatropha and Castor bean transcriptome samples were generated. Expression profile of 47 NBS-LRR genes and disease resistance specific transcription factors was mined in both the species. The pme_TPM values of genes ranged between 0.4–133.54 and 2–62.84 in Jatropha and Castor bean, respectively (Supplementary files 5, 6). In case of Jatropha out of 47 genes, gene with id contig_14680 showed highest pme_TPM value i.e. 133.54 as compared to Castor bean where highest pme_TPM value was 62.84 from the gene NODE_36679. When common genes between Jatropha and Castor bean were analyzed, it was found that in Jatropha gene contig_10121 showed highest pme_TPM value of 9.28 while its corresponding gene in Castor bean i.e. NODE_57743 showed pme_TPM value of 3.12. In case of Castor bean, gene NODE_34103 showed the highest pme_TPM value i.e.10.46 as compared to Jatropha where the corresponding gene contig_00810 showed value of 0.64. The pme_TPM values for transcription factors ranged from 0.42 to 289.67 and 1.74 to 237.15 for Jatropha and Castor bean, respectively (Supplementary file 7, Supplementary file 8). Similarly, on analyzing the transcript abundancy of transcription factors it was found that in case of Jatropha the transcription factor JcTF_15319 showed highest pme_TPM value i.e. 289.67 and in case of Castor bean the transcription factor RcTF_32546 showed highest value (237.15) of pme_TPM. 
On analyzing the common transcription factors between Jatropha and Castor bean, it was found that in Jatropha the transcription factor JcTF_14789 showed highest pme_TPM value of 142.48 whereas its corresponding transcription factor in Castor bean RcTF_20625 showed a value of 9.05. In case of Castor bean, the transcription factor RcTF_32546 gave highest pme_TPM value of 237.15 while its counterpart in Jatropha i.e. JcTF_04420 showed a value of 1.93. Further it was observed that 4 transcription factors showed higher pme_TPM values in Jatropha (range from 100 to 300) i.e. JcTF_14789, JcTF_14930, JcTF_15218 and JcTF_15319. Also in castor bean 5 transcription factors showed higher pme_TPM values (range from 100 to 300) i.e. RcTF_29450, RcTF_10530, RcTF_2255, RcTF_33998 and RcTF_32546.

Identification of CNLs and TNLs in identified NBS-LRR genes

For the prediction of CNLs and TNLs in the identified transcripts, PCOILS used sliding windows of 14, 21 and 28 residues, and predictions were made based on coiled-coil probability. In Jatropha, out of 47 identified NBS-LRR genes, 37 were predicted as TNLs and 10 as CNLs. Similarly, in Castor bean, out of 47 identified NBS-LRR genes, 28 were predicted as TNLs and 19 as CNLs.

Domain architecture of NBS-LRR genes in Castor bean and Jatropha genomes

The disease resistance gene sequences of both plant genomes were downloaded from the NCBI. The predicted disease resistance genes were reported as 121 and 91 in Castor bean and Jatropha, respectively. The genes were queried against their respective databases using BLASTN and BLASTP. Protein function domains in the disease resistance genes were predicted using the NCBI Conserved Domain search along with the HMM search using Pfam, and MAST was used to predict the distribution of motifs. These programs were appropriate only for determining the presence or absence of TIR, NBS and LRR domains; they were unable to identify more dispersed patterns or smaller individual motifs, such as those present in the CC domain. Using PCOILS, the disease resistance proteins were characterized into CNLs and TNLs. Out of 121 disease resistance genes of Castor bean, 80 were predicted as TNLs and 41 as CNLs. Similarly, out of 91 disease resistance genes of Jatropha, 54 were predicted as TNLs and 28 as CNLs. No significant result was obtained for the remaining 9 genes.

Organization of disease resistance genes in Castor bean and Jatropha genomes

In Castor bean, 121 disease resistance genes were distributed among 121 contigs. Similarly, in Jatropha, out of 91 genes, 82 genes were distributed among 82 contigs and no significant result was found for 9 other genes. Further analysis revealed that 7 of the disease resistance genes present in Castor bean genome, viz. XM_002517562.1, XM_002517561.1, XM_002518665.1, XM_002517526.1, XM_002517548.1, XM_002521759.1, and XM_002529578.1 showed similarity to Jatropha genome, corresponding to genes XP_002517608.1, XP_002517607.1, XP_002518711.1, XP_002517572.1, XP_002517594.1, XP_002521805.1, and XP_002529624.1, respectively (Supplementary Tables S1, S2). Due to the presence of same domains and motifs, the genes were further clustered in varying sizes, comprising 2–4 genes in most clusters. Although both these plants showed almost similar type of motifs (Kinase 1, Kinase 2, Kinase 3, GLPL, MHDL, and AAA+) which were found to be conserved in their disease resistance genes, certain differences were also observed with respect to the presence of conserved domains, which included presence of dirigent domain/superfamily along with protein kinase domain in Castor bean genome, and RPW8 domain/superfamily which was found to be unique to Jatropha genome (Table 3).

Discussion

Castor bean and Jatropha are considered promising biofuel crops. Commercial cultivation of selected genotypes of both plant species has predisposed them to a plethora of biotic stresses, including insect pests and fungal, viral and bacterial diseases. No systematic breeding efforts have been made towards the development of disease-resistant genotypes in either species. Since a large number of pest and disease resistance genes in various plant species belong to the NBS-LRR family of proteins, which is highly conserved across kingdoms, there was a need to analyze and characterize disease resistance genes, particularly NBS-LRR genes, and the defense-related transcription factors in both plants.

The previously predicted NBS-LRR genes in Jatropha and Castor bean [47, 48] are quite small in number in comparison with other sequenced plant genomes in the same range of genome sizes. For example, the genomes of A. thaliana and V. vinifera contain a relatively higher number of NBS-LRR genes (ranging from 174 to 535), even though their genome sizes are in the order of 125 and 487 Mb, respectively [65]. We identified 47 new NBS-LRR genes in each of Jatropha and Castor bean from the available transcriptomes, whereas earlier identification of NBS-LRR genes was done through genome mining, which may include pseudogenes [47, 48]. The identified NBS-LRR genes in Jatropha and Castor bean were also mapped onto the respective genomes to obtain an indication of their physical location [66]. NBS-LRR genes are frequently clustered in the genome as a result of segmental and tandem duplication [49, 67]. Consistent with these findings, we observed NBS-LRR genes in clusters in Jatropha as well as Castor bean. In Jatropha, 16 genes were found in clusters of two genes, whereas in Castor bean 6 genes were in clusters of two and 4 genes were clustered in a single contig. Our results indicate that there is more clustering in Jatropha than in Castor bean, which may support the concept of novel resistance specificities arising through recombination or gene conversion, and also of rapid R-gene evolution in Jatropha [40, 49]. Moreover, the NBS-LRR genes present in clusters can be primary targets in breeding for disease-resistant varieties. In both species, several NBS-LRR genes were mapped with gaps, which represent intronic regions in these genes; this is in consonance with the fact that most eukaryotic genes contain introns [68]. Further, these intronic regions can be explored in splice site studies for disease resistance [69–71].

Further, a comparative analysis between Jatropha and Castor bean revealed common and unique NBS-LRR genes. It was found that 7 and 8 NBS-LRR genes were common between Jatropha and Castor bean, respectively. Of the 8 Castor bean genes, 6 each showed identity to a specific gene from Jatropha, whereas 2 showed similarity to the same Jatropha gene. The results are in line with previous analyses of NBS-LRR genes and resistance gene analogues (RGAs) in sweet potato and Arabidopsis, which support the concept that such genes either arose by recent duplication or have been conserved without significant divergence [72, 73]. Common transcripts/genes can be targeted in a cross-generic or cross-specific manner for enhancing the disease resistance potential of Jatropha and Castor bean [40, 54, 74]. The common genes identified in both organisms imply that these genes are conserved in nature and may be responsible for providing resistance to general disease conditions not specific to any particular pathogen.

Transcript abundance was measured for the newly identified set of NBS-LRR genes by in silico expression analysis in order to support the identification of the transcripts and to estimate their expression levels. The expression values of the identified genes varied widely in both Jatropha and Castor bean. The genes with higher expression values and greater transcript abundance can be used to design experiments aimed at enhancing resistance to diseases and pests in Jatropha, Castor bean and related economically important plants of the same family, such as Rubber tree and Cassava [40, 74]. The identified NBS-LRR genes were further characterized into TNLs and CNLs, and in both species TNLs outnumbered CNLs, in keeping with the fact that TNLs are confined to dicots [75]. Further, the N-terminal domains, i.e. TIR (TNLs) and CC (CNLs), are responsible for pathogen recognition, which supports the resistance potential of the associated genes [76].

This investigation is the first attempt to identify transcription factors related to disease resistance or defense response in the whole transcriptomes of Jatropha and Castor bean, where 122 and 318 transcription factors were identified, respectively. Many transcription factors have been implicated in the transcriptional reprogramming linked with plant defense and resistance response. The interplay of activating and repressing transcription factors from many families controls the expression of target genes during the defense response [77]. Transcription factors of the WRKY, bZIP, ERF, MYB and Whirly families bind to the promoters of resistance genes and regulate their expression levels [34, 35, 37–39]. In comparison with conventional screening of cDNA libraries or EST sequencing, computational discovery of transcription factors provides a fast, simple, consistent and precise means of revealing the transcription factor families specific to disease resistance and defense response at both the whole genome and transcriptome levels. In Castor bean, the number of identified defense-related transcription factors is about 2.6 times that in Jatropha (318 versus 122), in keeping with the sizes of their transcriptomes. The identified transcription factors of Jatropha and Castor bean were compared in order to determine how many are common and how many are unique. A large number of transcription factors (70) are common to Jatropha and Castor bean, which also supports either recent duplication or a conserved defense response mechanism in the two species [78–80].
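The counts in this paragraph can be put side by side with a few lines of arithmetic. The totals (122, 318) and the number of common transcription factors (70) are taken from the text; the species-specific counts are derived under the assumption, not stated in the text, that the 70 common factors correspond one-to-one between the two species.

```python
# Counts quoted in the text.
jatropha_tfs = 122
castor_tfs = 318
common_tfs = 70

# Ratio of Castor bean to Jatropha defense-related transcription factors.
ratio = round(castor_tfs / jatropha_tfs, 2)

# ASSUMPTION: the 70 common factors pair one-to-one across the species,
# so the same number can be subtracted from each total.
jatropha_only = jatropha_tfs - common_tfs
castor_only = castor_tfs - common_tfs

print(ratio, jatropha_only, castor_only)
```

If several factors of one species match a single factor of the other, as was seen for the NBS-LRR genes, the two species-specific counts would need to be derived from the match table rather than by subtraction.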

In the past, many transgenic crop and model plants with improved disease resistance have been developed by overexpressing defense-related transcription factors [81]. Overexpression of WRKY and ERF transcription factors has resulted in disease resistant varieties of many plants [82]. Overexpression of defense-associated transcription factors can also provide resistance to many dissimilar pathogens. Overexpression of the Arabidopsis transcription factor MYB30 has resulted in enhanced resistance to pathogenic bacteria and fungi in transgenic Arabidopsis and Tobacco [83]. Identification of transcription factors related to defense response or disease resistance is also of great significance in predicting pathogen-responsive promoter elements, only a few of which have been identified so far. The most cited example is the W-box in the promoter regions of various genes activated by WRKY transcription factors [84–86]. In both Jatropha and Castor bean, only 4–5 transcription factors showed high transcript abundance, which marks them as potential targets for achieving disease resistance. These transcription factors can be given priority when manipulating their associated genes to develop resistant lines of Jatropha and Castor bean. A comparative study of the expression profiles, or transcript abundance measurements, of NBS-LRR genes and disease resistance-associated transcription factors in the two transcriptomes revealed that some NBS-LRR genes and transcription factors are good candidates for enhancing the resistance potential of Jatropha and Castor bean. Comparative analysis is expected to allow exploration of the evolutionary fate of the NBS-LRR genes and transcription factors in the Euphorbiaceae family and to improve the understanding of disease resistance among the important family members [59].
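As an illustration of how a pathogen-responsive element such as the W-box can be located in a promoter, the sketch below scans both strands of a sequence for the W-box core, commonly written TTGAC(C/T). The promoter sequence and the function names are hypothetical, and this simple pattern search is only a sketch, not the procedure used in this study.

```python
import re

# W-box core motif recognised by WRKY factors: TTGAC followed by C or T.
W_BOX = re.compile(r"TTGAC[CT]")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_w_boxes(promoter):
    """Return sorted (start, strand, motif) hits; start is 0-based on the given strand."""
    promoter = promoter.upper()
    n = len(promoter)
    hits = [(m.start(), "+", m.group()) for m in W_BOX.finditer(promoter)]
    for m in W_BOX.finditer(reverse_complement(promoter)):
        # Convert the reverse-strand match to forward-strand coordinates.
        hits.append((n - m.end(), "-", m.group()))
    return sorted(hits)

# Hypothetical promoter fragment (illustrative only).
promoter = "AATTGACCGGGGTCAAAT"
w_boxes = find_w_boxes(promoter)
print(w_boxes)
```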

A total of 121 disease resistance genes were predicted in the current version of the Castor bean genome [47], of which 80 have been classified as TNLs and 41 as CNLs. These are the two important subfamilies of NBS-LRR proteins in plants, carrying Toll/interleukin-1 receptor (TIR) or coiled-coil (CC) motifs, respectively, at the amino-terminal end. The NBS-LRR genes represent ~0.4 % of all identified ORFs in Castor bean. Similarly, 91 disease resistance genes were predicted in the Jatropha genome [48], of which 54 have been predicted as TNLs and 28 as CNLs, and the NBS-LRR genes represent ~0.3 % of all predicted ORFs. Since CNLs and TNLs are both involved in pathogen recognition [76], the prediction and classification of NBS-LRR proteins into CNLs and TNLs further supports their disease resistance potential. TNLs are known only in dicots and not in monocots [75], which further supports the motif prediction, as Jatropha and Castor bean are both dicotyledonous species. These results are in accordance with the classification into TNLs and CNLs of the newly identified NBS-LRR genes.
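The class proportions implied by these counts can be worked out as follows. All numbers are the ones quoted in the paragraph above; the function name summarize and the output layout are illustrative. Read this way, the Castor bean classes account for all 121 genes, while the two Jatropha classes quoted sum to 82 of the 91 genes.

```python
# Counts quoted in the text for the genome-predicted resistance genes.
predicted = {
    "Castor bean": {"total": 121, "TNL": 80, "CNL": 41},
    "Jatropha": {"total": 91, "TNL": 54, "CNL": 28},
}

def summarize(counts):
    """Return TNL and CNL percentages and the genes assigned to neither class."""
    total = counts["total"]
    return {
        "TNL_pct": round(100 * counts["TNL"] / total, 1),
        "CNL_pct": round(100 * counts["CNL"] / total, 1),
        "unassigned": total - counts["TNL"] - counts["CNL"],
    }

summary = {species: summarize(c) for species, c in predicted.items()}
for species, row in summary.items():
    print(species, row)
```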

The detailed analysis revealed that 7 of the disease resistance genes present in the Castor bean genome showed similarity to genes in the Jatropha genome, signifying that these genes either emerged from recent duplication or have been conserved without significant divergence, as was found earlier for NBS-LRR genes and RGAs in Sweet potato and Arabidopsis [72, 73]. Furthermore, 60 % gene clustering was observed in both plant species, and the genes present in clusters consisted of the same domains and motifs. Similar motif patterns were observed in both plants, which also corroborates the concept of synteny [87]. Certain differences in conserved domains were nevertheless observed between the two species, including the presence of a dirigent domain/superfamily along with a protein kinase domain in the Castor bean genome, and of an RPW8 domain/superfamily in the Jatropha genome.

The NBS-LRR genes and defense-related transcription factors predicted in this study, together with the domain architecture of the previously identified NBS-LRR genes, will supplement the disease resistance knowledge pool for both bioenergy plant species, so that better breeding and genomics-based interventions can be made for developing disease resistant varieties. Further, these in silico analyses and comparisons of NBS-LRR genes and transcription factors between Jatropha and Castor bean will provide specific insights into the function, organization, conservation and evolution of the NBS-LRR resistance genes and defense response-related transcription factors in related members of the family Euphorbiaceae.

Conclusion

The study has led to the identification of 47 new NBS-LRR genes, in addition to the 91 genes previously predicted in Jatropha and the 121 in Castor bean, and, for the first time, of 122 and 318 disease resistance-specific transcription factors in Jatropha and Castor bean, respectively. The outcome of the study is therefore of great practical importance for two major oilseed crops of industrial value. Since Jatropha and Castor bean are becoming susceptible to various diseases and biotic stresses, the current findings can be used to develop candidate gene markers for molecular breeding of disease resistance. The transcription factors specific to resistance or defense response can be targeted to engineer disease resistant varieties of Jatropha and Castor bean, which share taxonomical and biochemical similarity.

Data archiving statement

The sequences of NBS-LRR genes and transcription factors are available at the following link: http://sites.google.com/site/combiogroup/datadownload.