Introduction

Plants represent the major component of earth’s biota and produce their own food through photosynthesis. Every year a multitude of pathogens cause crop yield losses across the globe, and these losses are often severe. Plants are equipped with a variety of defense mechanisms to protect themselves against pathogen attack. Some of these are constitutive while others are induced upon attack. The interaction between plant and pathogen induces a variety of defense mechanisms, which include cell wall strengthening [1], de novo production of antimicrobial compounds (pathogenesis-related proteins) and secondary metabolites [2, 3]. Among the pathogenesis-related (PR) proteins, chitinase and glucanase play a crucial role since they directly attack structural components of fungi and insects, whereas enzymes of the plant secondary metabolite pathway, including chalcone synthase [4] and phenylalanine ammonia lyase [5], play significant roles owing to the antimicrobial nature of secondary metabolites.

Chitinases (EC 3.2.1.14), which are found in a wide range of organisms, catalyse the hydrolysis of chitin and play a major role in the plant defense mechanism against fungal pathogens. Chitinase catalyses the hydrolysis of the β-1,4-linkage of the N-acetylglucosamine polymer of chitin, a major component of fungal cell walls [6]. Plant chitinases usually have a wide range of optimum pH (pH 4–9) and are generally stable at temperatures up to 60 °C [7]. These enzymes usually have a molecular weight ranging from 25 to 35 kDa. They are usually involved in both active and passive defense against pathogens [8–10]. These enzymes also regulate growth and development by generating or degrading signal molecules [11–13] through programmed cell death (PCD) [14, 15]. Production of chitinases is regulated by a variety of stress factors, both biotic and abiotic, including infection, wounding, drought, cold, ozone, heavy metals, excessive salinity and UV light [7, 9, 16–18]. In addition, phytohormones such as ethylene, jasmonic acid, salicylic acid, auxin and cytokinin induce chitinase expression [19].

Several workers have proposed different classification schemes for plant chitinases, and the classification has undergone several modifications. Based on the hydrolytic sites, plant chitinases are broadly classified into two categories, endochitinases and exochitinases. With regard to physicochemical properties and enzymatic activity, plant chitinases are classified as PR proteins [7]. Based on biological properties, enzyme activity and coding sequence similarities, chitinases are represented by four PR protein families designated PR-3, PR-4, PR-8 and PR-11 [20]. Based on amino acid sequence homology, structure, substrate specificity, mechanism of catalysis and sensitivity to inhibitors, they are classified into seven classes, I–VII [21]. Class I is further divided into two subclasses, Ia and Ib. PR-3 includes chitinases of classes Ia, Ib, II, IV, VI and VII; chitinases of class III belong to PR-8, and chitinases of class V to PR-11. Among the chitin-binding proteins, those with low endochitinase activity form the PR-4 class [22].

Class I chitinases have a cysteine-rich chitin-binding domain (CBD) at the N-terminus and a C-terminal catalytic domain (CatD). The CBD is linked to the CatD by a proline- and glycine-rich linker, which varies in length and composition [20]. In contrast, class II chitinases lack the N-terminal CBD and the linker region, but they show high sequence homology with the CatD of class I chitinases. Class III chitinases show lysozyme activity and do not reveal any sequence homology to either class I or class II chitinases. All plant chitinases from this class show a good percentage of sequence similarity among themselves but differ widely in their isoelectric points [23]. Class IV chitinases have low sequence similarity with class I chitinases; they contain a CBD and a CatD which resemble those of class I chitinases. Both domains of class IV chitinases are significantly smaller than those of class I because of one deletion in the CBD and three deletions in the CatD. Class V chitinases possess two CBDs in tandem [24, 25]. A heavily truncated CBD along with a proline-rich spacer characterises class VI chitinases [26]. Class VII chitinases lack the CBD but possess a CatD that is homologous to that of class IV chitinases [27].

Furthermore, chitinases fall into two families of glycoside hydrolases (GH), families 18 and 19; glycoside hydrolases as a whole are divided into more than 110 families based on the amino acid sequence similarity of their catalytic domains [28, 29]. The members of the two families differ in their amino acid sequences, three-dimensional (3D) structures and molecular mechanisms of catalysis [30], and are thus considered to have different evolutionary origins. GH family 18 chitinases (classes III and V) are widely distributed in a variety of organisms, such as bacteria, fungi, viruses, animals and higher plants. The distribution of GH family 19 enzymes (classes I, II, IV, VI and VII) is more restricted; they are mainly found in higher plants and some bacteria [28, 31].

Rice (Oryza sativa L.), the model plant for the monocotyledons, had its genome completely sequenced in 2004. Genome annotation of rice reveals that chitinase genes are present on all chromosomes except chromosome seven [32]. Rice possesses several classes of chitinases encoded by different genes located on different chromosomes. Although the expression of these genes is differentially induced and regulated, they act both directly and indirectly in plant defense and are also associated with numerous roles in plant physiology. Several lines of evidence indicate that chitinase expression and enzyme activity make a major contribution to disease resistance against fungal pathogens in rice. Transgenic plants that constitutively expressed a rice class I chitinase gene, Cht-2 or Cht-3, showed significant resistance against two races of Magnaporthe grisea [33]. The chitinase activity of transgenic plants overexpressing a rice chitinase gene was found to be correlated with the level of enhanced resistance against Rhizoctonia solani [34]. A study on the antifungal properties of class I and class II chitinases showed that class I chitinases have three to five times higher activity than class II chitinases [28]. Purified rice basic class III chitinase, expressed in Pichia pastoris, was shown to be an effective lytic agent of Micrococcus lysodeikticus at pH 3–5, but to show only weak fungal inhibition towards Trichoderma reesei [35]. A study on class I and class IV chitinases in rice showed that both possess a similar catalytic domain and have similar N-acetyl-chitin-oligosaccharide degradation efficiency [27]. This evidence supports the hypothesis that rice chitinases play an important role in fungal resistance.

Six crystal structures of plant GH-19 chitinases are available in the Protein Data Bank (PDB). These are from barley (class II), jack bean (class II), mustard (class I), papaya (class II), Norway spruce (class IV) and rice (class I). Rice possesses several family 19 chitinases. To date, however, only the rice class I chitinase OsChia1b, also referred to as RCC2 or Cht-2, has been deposited in the PDB [PDB accession code: 2DKV] [28]. The report on 2DKV reveals that this chitinase comprises two domains (an N-terminal CBD and a C-terminal CatD) interconnected by a linker peptide rich in proline and threonine. Kezuka et al. also reported that the CBD of 2DKV binds to chitin and acts as an anchor, whereas the CatD degrades the chitin chain depending upon the linker length. As rice possesses several classes of chitinases, in the present study we selected a few reviewed representative chitinases of classes I, II and IV. An attempt was made to elucidate the structure, function and evolution of nine rice (Oryza sativa L.) chitinases belonging to family 19 using a comparative proteomic approach with the aid of high-throughput computational tools. This study provides an insight into the structural variation, evolution and molecular function of different classes of chitinases in rice.

Methods and materials

Software used

  • MEGA5.0

  • Modeller9v9

Sequence retrieval and multiple sequence alignment

The reviewed fasta sequences of chitinase enzyme (EC 3.2.1.14) of rice (Oryza sativa L.) were retrieved from the UniProtKB (www.uniprot.org/help/uniprotkb) database. In this study a total of 12 chitinase sequences were selected and nine of them were analysed. Information about the twelve sequences is depicted in Table 1. The multiple sequence alignment was produced using ClustalW2 at EMBL-EBI (http://www.ebi.ac.uk/Tools/msa/clustalw2/) [36].

Table 1 Rice chitinase sequences analysed in this study

Phylogeny analysis

For the construction of the molecular evolutionary tree, a total of 74 reviewed plant chitinase sequences were retrieved from the UniProtKB database and aligned in ClustalW. The neighbour-joining method [37] was used to construct the phylogenetic tree in MEGA5.0 [38]. The level of confidence was estimated using 1000 bootstrap replications.
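The resampling that underlies a bootstrap estimate can be sketched as follows. This is only an illustration and not the MEGA5.0 implementation: the toy alignment, the function names `p_distance` and `bootstrap_replicate`, and the choice of the simple p-distance are assumptions made for the example, since the distance model is not stated in the text.

```python
import random

def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement to form one pseudo-alignment."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}

# Hypothetical toy alignment (not the rice chitinase data).
toy_alignment = {
    "seqA": "ACDEFGHIKL",
    "seqB": "ACDEFGHIRL",
    "seqC": "ACD-FGYIRL",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(toy_alignment, rng) for _ in range(1000)]
print(p_distance(toy_alignment["seqA"], toy_alignment["seqB"]))
```

In a full analysis each replicate would be converted into a distance matrix and a neighbour-joining tree, and the bootstrap value of a node would be the percentage of replicate trees in which that grouping reappears.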

Physico-chemical property analysis of chitinases

For the elucidation of the physico-chemical properties of chitinases, the ProtParam tool (http://expasy.org/cgi-bin/protparam) [39] of the Expasy Proteomic Server was used. The theoretical isoelectric point (pI), molecular weight, extinction coefficient, instability index, aliphatic index and grand average of hydropathy (GRAVY) were calculated.

Secondary structure and disorder region prediction

The secondary structures of chitinases were predicted using PSIPRED server (http://bioinf.cs.ucl.ac.uk/psipred/) [40]. This is a simple and accurate secondary structure prediction method that incorporates two feed-forward neural networks which perform an analysis on output obtained from PSI-BLAST. SOPMA (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html) [41] and GOR IV (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html) [42] servers were also used to predict the secondary structures. The disorder regions of rice chitinases were predicted by protein disorder meta-prediction server (metaPrDOS) (http://prdos.hgc.jp/meta/) [43].

Domain analysis and linker prediction

For the characterisation and understanding of protein function, detailed knowledge of protein domain boundaries and architecture is essential. In the absence of known 3D structures, the delineation of the domain boundaries of a given sequence benefits many areas of protein science, such as protein engineering and protein structure prediction [44]. We used a comparative domain boundary prediction method, SBASE (http://hydra.icgeb.trieste.it/sbase/) [45]. This method exhaustively searches the sequence against known domain definitions within the associated domain database. Domain boundaries are predicted along with domain content, which makes the method useful for identifying protein domain architecture. Linkers are the sequence regions between defined structural domains. Linker regions are usually non-globular, unstructured or low-complexity segments that are flexible in 3D space, but studies have shown that the linker region may significantly affect the cooperation and interaction between domains and therefore alter the overall functionality and efficiency of multi-domain proteins [46]. The linker sequences joining the two discrete domains of chitinases were delineated manually.

Comparative modelling

Template identification

In order to find suitable templates for the comparative modelling of chitinase, the target sequences were searched for similar sequences using the basic local alignment search tool (BLASTP) [47] against the PDB. The BLOSUM62 matrix was used with a default threshold E-value of 10 and an inclusion threshold of 0.005. Templates were selected on the basis of query coverage, sequence identity, low E-value and structural resolution.
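A minimal sketch of how candidate templates could be ranked on these four criteria is given below. The `Hit` record, the function `rank_templates`, the placeholder identifiers and all numerical values are hypothetical, and the order in which the criteria are applied (identity, then coverage, then E-value, then resolution) is an assumption, as the text does not specify how they were weighted.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    pdb_id: str
    coverage: float    # query coverage in percent
    identity: float    # sequence identity in percent
    evalue: float      # BLAST E-value (lower is better)
    resolution: float  # crystallographic resolution in angstrom (lower is better)

def rank_templates(hits, max_evalue=10.0):
    """Drop hits above the E-value threshold, then sort best-first.

    Assumed priority: higher identity, higher coverage, lower E-value,
    better (numerically smaller) resolution.
    """
    kept = [h for h in hits if h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: (-h.identity, -h.coverage, h.evalue, h.resolution))

# Hypothetical hits with placeholder identifiers, for illustration only.
hits = [
    Hit("TMPL1", coverage=95.0, identity=60.0, evalue=1e-80, resolution=2.0),
    Hit("TMPL2", coverage=98.0, identity=72.0, evalue=1e-100, resolution=1.8),
    Hit("TMPL3", coverage=40.0, identity=72.0, evalue=20.0, resolution=1.5),
]

for hit in rank_templates(hits):
    print(hit.pdb_id, hit.identity, hit.coverage, hit.evalue, hit.resolution)
```

A lexicographic sort is the simplest choice; a weighted score would be an equally valid alternative if the criteria are to be traded off against one another.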

Model building and refinement

The theoretical structures of rice chitinases were built using MODELLER-9v10 [48]. Modeller implements comparative protein structure modelling by satisfaction of spatial restraints. For 3D model building of each chitinase, the target sequence was aligned with its respective template(s) in Modeller. Both single-template and multiple-template methods were used to build the 3D structures. The models generated were subjected to loop refinement in Modeller.

Model evaluation

The quality of the refined models was evaluated both geometrically and energetically by a series of tests for internal consistency and reliability. We used the PROCHECK [49], WHAT_CHECK [50], ERRAT [51] and VERIFY_3D [52, 53] tools embedded in the structure analysis and validation server (SAVES) (http://nihserver.mbi.ucla.edu/SAVES/). PROCHECK analyses the Ramachandran plot quality, peptide bond planarity, non-bonded interactions, main chain hydrogen bond energy, Cα chiralities and overall G factor. The non-bonded interactions between different atom types were checked by ERRAT. Verify_3D was used to assess the compatibility of the atomic model with its own amino acid sequence. A high Verify_3D profile score indicates a high-quality protein model. The Protein Structure Analysis (ProSA) (https://prosa.services.came.sbg.ac.at/prosa.php) [54] tool was employed in the refinement and validation of the modelled structures. The root mean square deviation (RMSD) between the main chain atoms of the models and their respective templates was calculated by structural superimposition using the iPBA web server (http://www.dsimb.inserm.fr/dsimb_tools/ipba/index.php) [55]. To sum up, the geometry, non-bonded interactions of atoms, energy profiles and RMSD of most of the rice chitinase models are reasonable, and the models are reliable for further study.

Function prediction

The 3d2GO server (http://www.sbg.bio.ic.ac.uk/phyre/pfd/index.html) was used to predict the functions of the validated models from sequence and structure with reference to Gene Ontology (GO) terms. Various sources of information were used, such as sequence homology to functionally annotated sequences, overall topological similarity to structures of known function, and geometric and residue similarity of predicted functional sites to regions of known structures.

Identification of functional surface

For each modelled chitinase, we used the CASTp server (http://sts.bioengr.uic.edu/castp/calculation.php) [56] to identify the functional surface, which is taken as the surface pocket containing annotated binding site residues. If more than one such pocket exists on a protein structure, only the largest one is selected, as the largest pocket often corresponds to the enzyme binding site [57].
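The selection rule can be sketched as below. The pocket records, residue numbers and the function `select_functional_pocket` are hypothetical and do not reflect the CASTp output format; two further assumptions are that "largest" is judged by pocket volume and that a pocket qualifies when it contains at least one annotated binding site residue.

```python
def select_functional_pocket(pockets, binding_residues):
    """Return the largest pocket that contains an annotated binding residue.

    pockets: list of dicts with keys "id", "volume" and "residues" (a set of
             residue numbers lining the pocket).
    binding_residues: set of annotated binding site residue numbers.
    Returns None when no pocket contains any annotated residue.
    """
    candidates = [p for p in pockets if p["residues"] & binding_residues]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p["volume"])

# Hypothetical pockets and annotation, for illustration only.
pockets = [
    {"id": 1, "volume": 850.0, "residues": {5, 6, 7}},        # large, no annotated residue
    {"id": 2, "volume": 620.0, "residues": {10, 25, 40}},     # contains annotated residues
    {"id": 3, "volume": 310.0, "residues": {25, 26}},         # smaller, also annotated
]
annotated = {10, 25}

chosen = select_functional_pocket(pockets, annotated)
print(chosen["id"], chosen["volume"])
```

Filtering before taking the maximum matters: the overall largest pocket (pocket 1 in this toy input) is skipped because it contains no annotated residue.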

Results and discussion

Twelve reviewed sequences of rice chitinases belonging to classes I, II and IV were retrieved from the UniProtKB database. Of these sequences, chitinase1, 2, 3, 7, 9, 10 and 12 belong to class I, chitinase8 and 11 belong to class II, and chitinase4, 5 and 6 belong to the class IV sub-family. All of them are referred to as PR-3 proteins, belonging to GH family 19. The multiple sequence alignment constructed in ClustalW is shown in Fig. 1. The results showed that class I and class IV chitinases of rice have two common domains, i.e. an N-terminal cysteine-rich chitin-binding domain (CBD) and a C-terminal catalytic domain (CatD). The aromatic amino acids of the CBD of class I and IV chitinases are highly conserved. Though chitinase10 belongs to class I, it lacks the N-terminal CBD, which is a conserved feature of class I and class IV chitinases. The CBD of class IV has a deletion of approximately seven amino acid residues in the N-terminal region. The CatD of class IV possesses several deletions of around 35 amino acids in the C-terminal region, which may account for the smaller molecular size of class IV chitinases in comparison with class I chitinases. Class II chitinases of rice, which lack the N-terminal CBD, are homologous to the CatD of class I and IV chitinases. Owing to the lack of the N-terminal CBD and a few deletions in the CatD, class II chitinases are smaller than class I chitinases, although sequence similarity within the class is high. The two domains (CBD and CatD) of class I and class IV chitinases are interconnected by a glycine-, proline- and serine-rich linker peptide.

Fig. 1

Amino acid sequence alignment of class I (1, 2, 3, 7, 9, 10, 12), class II (8, 11) and class IV (4, 5, 6) chitinases of rice. Sequence alignment was performed in ClustalW. Aromatic amino acid residues conserved in all sequences are shown in orange, cysteine residues in cyan, and serine and threonine in green. The two domains (CBD and CatD) are highlighted in square boxes

A total of 74 chitinase sequences from different plant species were selected and aligned in ClustalW for phylogeny analysis. The alignment was then used for phylogenetic tree construction using the NJ method with 1000 bootstrap replications in MEGA5.0. The consensus tree showed a dichotomy with two different clusters (cluster I and II) whose nodes have strong bootstrap support (>70) (Fig. 2a). On the basis of all chitinases analysed in the present study, it may be surmised that the main clusters of chitinases in monocotyledons and dicotyledons differ from each other. Within the monocot cluster there exist sub-clusters (sub-cluster Ia and Ib), indicating that the chitinases in the same cluster are still evolving in different plants (Fig. 2b). The phylogeny analysis suggests that, despite sequence divergence among the chitinases of rice, they have evolved from a common ancestor and have been conserved throughout the evolutionary process.

Fig. 2

(a) Circular tree showing the evolutionary origin and evolutionary relationship among different plant chitinases. (b) Phylogenetic tree inferred from neighbour-joining method showing dichotomy with two different clusters with their sub-clusters

The physico-chemical properties of rice chitinases, computed using the ProtParam tool of the Expasy proteomic server, are tabulated in Table 2. The molecular weight of the selected rice chitinases ranges between 27 and 35 kDa. The isoelectric point (pI) is the pH at which the surface of the protein is covered with charge but the net charge of the protein is zero. At the pI, proteins are stable and compact. Computational results showed that chitinase3, 8, 9, 11 and 12 are acidic in nature (pI <7.0) whereas chitinase1, 5, 7 and 10 are basic in nature (pI >7.0). The computed pI will be useful for developing buffer systems for purification by isoelectric focusing. The aliphatic index (AI), defined as the relative volume of a protein occupied by aliphatic side chains, is regarded as a positive factor for the thermal stability of globular proteins [58]. The aliphatic index of the chitinases ranges between 50.90 and 67.82. The high aliphatic index indicates that rice chitinases may be stable over a wide range of temperatures. The instability index provides an estimate of the stability of a protein in a test tube. Certain dipeptides occur at significantly different frequencies in unstable proteins compared with stable ones. A protein whose instability index is smaller than 40 is predicted to be stable; a value above 40 predicts that the protein may be unstable [59]. With the exception of chitinase10, all chitinases were predicted to be stable, with an instability index <40. The grand average of hydropathicity (GRAVY) value for a peptide or protein is calculated as the sum of the hydropathy values of all the amino acids divided by the number of residues in the sequence [60]. The GRAVY indices of the chitinases range from −0.107 to −0.453. These negative values indicate the possibility of better interaction with water.
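Two of these indices are simple enough to be reproduced by hand. The sketch below is not the ProtParam code; it assumes the Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula, AI = X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. The function names and the test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(sequence):
    """Sum of hydropathy values divided by the number of residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)

def aliphatic_index(sequence):
    """AI = X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    n = len(sequence)

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))

def is_predicted_stable(instability_index, cutoff=40.0):
    """Apply the cut-off quoted in the text: below 40 is predicted stable."""
    return instability_index < cutoff

# Hypothetical short peptide, used only to exercise the functions.
peptide = "GSPTGGSAVIL"
print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 2))
```

Sequences containing non-standard residue codes raise a `KeyError` in `gravy`, which is a deliberate choice in this sketch so that unexpected characters are not silently ignored.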

Table 2 Physico-chemical Properties of chitinases

Secondary structures of chitinases were predicted using the PSIPRED, SOPMA and GOR IV servers. Comparison of the results from these servers showed that, in all classes (I, II and IV) of chitinases investigated in the present study, random coils dominate among the secondary structure elements, followed by helices and strands (Table 3). Although these servers use different algorithms and approaches to predict secondary structure elements from primary amino acid sequences, the results obtained are approximately the same and were considered for further investigation.

Table 3 Secondary structures statistics predicted from different servers

The metaPrDOS server was used to predict natively disordered regions of rice chitinases from their amino acid sequences. In their native states, proteins often have regions with very flexible and unstable structures; these disordered regions are involved in many biological processes such as regulation, signalling and cell cycle control [61, 62]. Disordered regions appear to act as molecular recognition sites for proteins or DNA [63, 64]. During interaction with ligands, disordered regions are frequently observed to undergo a transition to order, and the flexibility of the region provides high specificity and low affinity towards multiple partners [65]. It is therefore necessary to identify the disordered regions of target proteins from their amino acid sequences. The results from this server revealed that a few residues at the beginning of the N-terminal region and some residues at the end of the C-terminal region fall in disordered regions. In addition, a stretch of consecutive residues between the two termini of the chitinases (the linker portion) falls in a disordered region. In all classes (I, II and IV) of rice chitinases, the disordered regions are dominated by hydrophilic and charged residues, low sequence complexity regions and residues that can be phosphorylated (serine, threonine and tyrosine). The residues predicted to be disordered and their positions are listed in Table 4.

Table 4 Residues involved in the disordered region of rice chitinases

The domain boundaries of chitinases were predicted using the SBASE server. It predicts domain boundaries as well as domain content and can thus be used for the identification of protein domain architecture. The results showed that chitinase1, 3, 5, 7, 9 and 12 comprise two domains, namely a chitin binding type-1 like domain and a glycoside hydrolase, family 19-like domain, whereas chitinase8, 10 and 11 possess a single glycoside hydrolase, family 19-like domain. The typical domains of rice chitinases and their positions in the sequences are listed in Table 5. Discrete domains are often associated with multiple functions of a protein, and the domains are connected by inter-domain linkers. The linker region, which is flexible in 3D space, may significantly affect the cooperation and interaction between domains and so alter the overall functionality and efficiency of multi-domain proteins. Linkers keep the domains apart and give them considerable freedom to move individually, which is part of their catalytic function. As chitinases possess discrete domains, it is important to predict the linker sequence that joins the adjacent domains. The linker regions were manually curated and are reported in Table 5. We analysed the amino acid propensities in the linkers and examined the order of residues within them. The amino acids glycine (G), proline (P) and serine (S) predominate in these linker regions, which may provide the flexibility for the two discrete domains of rice chitinase to act independently. It has also been reported that pentapeptides consisting of Gly, Ser and Thr would make the best linkers for gene fusion, as these residues are most strongly preferred within natural linkers [66].
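The residue composition of a linker can be tallied with a few lines of code. The sketch below is illustrative only: the linker string is hypothetical and is not one of the sequences in Table 5, and the function names are introduced here for the example.

```python
from collections import Counter

def residue_fractions(linker):
    """Fraction of each residue type in a linker, most frequent first."""
    counts = Counter(linker)
    return {aa: n / len(linker) for aa, n in counts.most_common()}

def combined_fraction(linker, residues="GPS"):
    """Fraction of the linker made up of the given residue types."""
    return sum(linker.count(aa) for aa in residues) / len(linker)

# Hypothetical linker sequence, for illustration only.
linker = "GGPSPGSGPT"

print(residue_fractions(linker))
print(combined_fraction(linker))
```

Applied to each curated linker in turn, such a tally gives the per-residue propensities from which the predominance of particular residues can be read off directly.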

Table 5 Domains and inter-linking linker of rice chitinase predicted using SBASE server

Comparative modelling is considered one of the most accurate methods for 3D structure and function prediction, yielding models suitable for a wide spectrum of applications [67]. It is usually the method of choice when a clear homology relationship is found between the sequence of the target protein and at least one known structure. The approach gives reasonable results based on the assumption that the tertiary structures of two proteins will be similar if their sequences are related [68]. Rice possesses several family 19 chitinases, but to date only the class I chitinase OsChia1b, referred to as Cht-2, has been reported in the PDB (PDB ID: 2DKV). This prompted us to construct homology models of different classes of family 19 chitinases of rice whose structures have not been reported. Templates were retrieved by performing a BLAST search against the PDB and were selected based on query coverage, sequence identity, low E-value and structural resolution. For comparative modelling, both single-template and multi-template approaches were used. 2DKV and 2Z39 were considered the best templates for model building of chitinase1, 2, 9 and 12 as they showed a high percentage of identity with the query sequences. 2UVO and 3HBD were chosen as templates for chitinase5, and 2BAA was used as the template for chitinase11. For chitinase8 and 10, multiple templates were selected, as it is recommended to use multiple templates (when available) to avoid biasing the models towards one protein or one set of side chain conformations [69, 70].

Modeller9v10 was used to build the three-dimensional models of chitinases based on the target-template alignment. Modeller generated five predicted structures for each rice chitinase. The models with the lowest discrete optimized protein energy (DOPE) score, a statistical potential used to assess homology models, were considered to be the most thermodynamically stable and were chosen for further refinement and validation. The models with amino acids in loop regions were subjected to loop refinement in Modeller using the loop_refine.py script. The loop refinement script generated five different models of each chitinase, and the models with the lowest energy were chosen for further study. The total energy of the models was calculated with the GROMOS96 force field; the energies of the models before and after refinement in Modeller are given in Table 6. The decrease in force field energy after refinement indicates that the models were improved by refinement. The low energy values of almost all models compared with their templates indicate the stability of the models.
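The choice of the lowest-DOPE model can be sketched as follows. In Modeller's Python interface a completed modelling run exposes a list of per-model records with the keys `name`, `failure` and, when DOPE assessment is requested, `DOPE score`; the sketch mimics that layout with plain dictionaries so that it runs without Modeller installed. The file names and scores are hypothetical, and `best_by_dope` is a helper introduced here, not part of Modeller.

```python
def best_by_dope(outputs):
    """Return the successfully built model with the lowest (most negative) DOPE score.

    outputs: list of dicts with keys "name", "failure" and "DOPE score",
             laid out like the records Modeller stores after a run.
    """
    built = [m for m in outputs if m.get("failure") is None]
    if not built:
        raise ValueError("no model was built successfully")
    return min(built, key=lambda m: m["DOPE score"])

# Hypothetical records for five models of one target (values are made up).
outputs = [
    {"name": "model_1.pdb", "failure": None, "DOPE score": -30120.5},
    {"name": "model_2.pdb", "failure": None, "DOPE score": -30488.1},
    {"name": "model_3.pdb", "failure": "optimisation failed", "DOPE score": -31000.0},
    {"name": "model_4.pdb", "failure": None, "DOPE score": -29975.3},
    {"name": "model_5.pdb", "failure": None, "DOPE score": -30310.9},
]

best = best_by_dope(outputs)
print(best["name"], best["DOPE score"])
```

Models flagged as failed are excluded before the comparison, so a spuriously low score attached to an incomplete model cannot be selected.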

Table 6 Model validation scores and energy of the models

The overall stereochemical quality and accuracy of the predicted models were evaluated using the Ramachandran plot (Fig. 3) in Procheck. The refined models showed a good percentage of residues in the most favoured, additional allowed and generously allowed regions (Table 7). The absence of residues from disallowed regions in the chitinase models (except chitinase5 and chitinase7) supports their high geometric quality. Although the chitinase5 and chitinase7 models each have one residue in a disallowed region of the Ramachandran plot, these residues do not interfere with the active sites, and hence the models are acceptable. The overall G-factors were also within the acceptable range, as shown in Table 7 (acceptable values of the G-factor in Procheck are between 0 and −0.5, with the best models displaying values close to zero) [60], indicating that the models are of good quality.
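A Ramachandran plot is built from the backbone torsion angles φ (C of the previous residue, N, Cα, C) and ψ (N, Cα, C, N of the next residue). The following sketch shows how such a torsion angle can be computed from four atomic positions; it is not the Procheck code, the function names are introduced here, and the coordinates in the example are artificial points chosen to give easily checked angles.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, in the range -180 to 180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

def phi_psi(c_prev, n, ca, c, n_next):
    """Backbone torsions of one residue from five backbone atom positions."""
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)

# Artificial points: a planar cis arrangement gives 0 degrees.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0)))
```

Each residue's (φ, ψ) pair is then placed on the plot and classified by region; the region boundaries themselves are defined by Procheck and are not reproduced here.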

Fig. 3

Ramachandran plot of chitinases of rice. (a) Chitinase1 of class I (b) chitinase11 of class II (c) chitinase5 of class IV

Table 7 Ramachandran plot statistics and overall G- factors

The packing quality of each residue, as assessed by the Verify_3D program, is represented by a profile over the residues. The compatibility of the model residues with their environment is assessed by a score function. Residues with a score over 0.2 should be considered reliable. The scores for all refined models lie mostly above 0.2, which corresponds to an acceptable side chain environment (Table 6).

Energy profiles of the models were obtained using ProSA (Fig. 4). ProSA gave Z-scores (a measure of model quality based on the total energy of the structure) between −5.0 and −7.5 for the models (negative values imply model accuracy), as shown in Table 6. The degree of structural similarity was measured using the root mean square deviation (RMSD) between equivalent atom pairs. To investigate how well the modelled structures match the X-ray data of the templates, the models and their respective templates were superimposed on their backbone atoms using the iPBA web server. The backbone RMSD values for all models (Table 6) indicate that the generated models are reasonably good and quite similar to their templates. The coordinates of the models are deposited in the Protein Model DataBase (PMDB) and can be accessed at http://mi.caspur.it/PMDB using PMDB ID: PM0077946-954. The refined and validated models are shown in Figs. 5, 6 and 7.
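For two sets of N equivalent atoms, the RMSD is the square root of the mean squared distance between paired atoms. A minimal sketch of this calculation is given below; it assumes that the two coordinate sets have already been superimposed and paired atom by atom, and it does not perform the superposition itself, which in this work was done by the iPBA server. The function name and the coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    The lists must list equivalent atoms in the same order and must already
    be superimposed; no fitting is carried out here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical backbone atom positions (angstrom), for illustration only.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

print(rmsd(model, template))
```

Because every atom in the toy example is displaced by the same distance, the RMSD equals that displacement; with real structures the value summarises displacements that vary along the chain.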

Fig. 4

ProSA energy profile of rice chitinase. (a) Chitinase1 of class I (b) chitinase11 of class II (c) chitinase5 of class IV

Fig. 5

Homology model of chitinase1 from rice. (a) Solid ribbon representation of the chitin binding domain (CBD) coloured by their secondary structure elements and disulfide bridges between cysteine residues are labelled (b) Solid ribbon representation of the catalytic domain (CatD) coloured by their secondary structure elements. Disulfide bridges between cysteine residues and conserved residues are labelled

Fig. 6

Solid ribbon representation of the catalytic domain (CatD) of chitinase11 coloured by its secondary structure elements. Disulfide bridges between cysteine residues and conserved residues are labelled

Fig. 7

Solid ribbon representation of the CBD (a) and CatD (b) of chitinase5 coloured by its secondary structure elements. Disulfide bridges between cysteine residues and conserved residues of CatD are labelled

The structures of the chitinases belonging to the three different classes were analysed extensively. The predicted structures of class I chitinases (chitinase1, 3, 7, 9 and 12) of rice revealed that they are composed of two discrete domains interlinked by a hinge region rich in proline and glycine, with the exception of chitinase10, which does not possess the N-terminal CBD. This agrees with the comparative sequence analysis. The secondary structure elements predicted by the different servers agree well with the results from STRIDE [71], which recognises secondary structural elements in proteins from their atomic coordinates. The two discrete domains (CBD and CatD) were built independently and analysed.

The 3D structure of the CBD of chitinase1 was modelled using the CBD of its nearest neighbour 2DKV (chitinase2) as template. The predicted structure of chitinase1, a class I chitinase, revealed that the CBD is composed of a 3₁₀ helix, an α-helix and a two-stranded anti-parallel β-sheet (Cys37-Ser39 and Asn43-Gly45) whose strands are connected by a turn of three residues (Gln40-Gly42). The CBD is connected to the CatD by a flexible linker sequence rich in glycine and proline residues. A total of three disulfide bridges (Cys23/Cys38, Cys32/Cys44 and Cys35/Cys51), which are conserved in related sequences of the class I family, maintain the shape and stability of the fold. The aromatic residues tyrosine and tryptophan are highly conserved and are mostly clustered on one face of the protein. The CBD also possesses cysteine and glycine residues at conserved positions. The conserved cysteine residues forming disulfide bridges are thought to maintain the stability of the structure, which may help the enzyme in extracellular activities, and the conserved glycine residues may help the correct folding of the chitinase.

The CatD of chitinase1 was modelled using the CatD of mustard chitinase (2Z39) as template. The CatD of rice chitinase1 is dominated by α-helices, comprising 12 helices and five disulfide bridges. The structure resembles that of the CatD of mustard and barley chitinases (GH family 19). The superposition (overlapping of Cα atoms) of the CatD with the corresponding domain of 2Z39 reveals that the spatial position and orientation of the helices are highly conserved (Fig. 8b). The CatD of chitinase1 has a triad consisting of Glu144, Glu282 and Arg294, which corresponds to the catalytic triad of its template 2Z39 (Glu212, Glu349 and Arg361). Ubhayasekera et al. [72, 73] have reported that Arg361 and Glu349 are essential for catalysis and work together with Glu212 as a catalytic triad in 2Z39. As evidenced by the structural alignment of chitinase1 with 2Z39, the key elements of secondary structure are strongly conserved (secondary structure element identity of 93.8 %). From Fig. 8b, it is evident that the CatD possesses five loops, namely I, II, III, IV and V, which are the most striking features of class I and II family 19 chitinases. Loops I, II and V are missing in bacterial chitinases [74]. This pattern of missing loops in family 19 suggests that loop III, which is common to all classes of chitinase, might play an important functional role. The structure of chitinase3 is the same as that of chitinase1, whereas the structures of chitinase7, 9, 10 and 12 closely resemble those of chitinase1 and 3. They differ slightly in the number of α-helices and of disulfide bridges, which help maintain the stability of the folds present in the chitinases of rice.
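
The correspondence between template and model residues, such as the triad positions quoted above, follows from walking along a gapped pairwise alignment and counting residues in each sequence. The sketch below shows that bookkeeping on a toy alignment. The function name, the sequences and the start numbers are hypothetical and are not the chitinase1/2Z39 alignment.

```python
def map_positions(aligned_template, aligned_model, template_start=1, model_start=1):
    """Map template residue numbers to (model number, template aa, model aa).

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    """
    if len(aligned_template) != len(aligned_model):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    t_num, m_num = template_start, model_start
    for t_res, m_res in zip(aligned_template, aligned_model):
        t_present = t_res != "-"
        m_present = m_res != "-"
        if t_present and m_present:
            mapping[t_num] = (m_num, t_res, m_res)
        # Advance each counter only when that sequence has a residue here.
        if t_present:
            t_num += 1
        if m_present:
            m_num += 1
    return mapping


# Toy alignment (invented): template numbering starts at 10, model at 1.
template_aln = "AEG-RKE"
model_aln = "-EGSRKE"
mapping = map_positions(template_aln, model_aln, template_start=10, model_start=1)

for t_pos in (11, 13, 15):  # hypothetical "triad" positions in the template
    m_pos, t_aa, m_aa = mapping[t_pos]
    status = "conserved" if t_aa == m_aa else "substituted"
    print(f"template {t_aa}{t_pos} -> model {m_aa}{m_pos} ({status})")
```

Because insertions and deletions shift the numbering, the offset between template and model positions need not be constant along the chain.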

Fig. 8

Structural superposition of chitinase1 with its template. (a) Superposition of CBD with 2DKV along with their secondary structure alignment [* Sec A: Chitinase1, Sec B: 2DKV]. CBD of chitinase1 is coloured in green and CBD of template is coloured in yellow. (b) Superposition of CatD with 2Z39 along with their secondary structure alignment [* Sec A: Chitinase1 Sec B: 2Z39]. CatD of chitinase1 is coloured in green and CatD of template is coloured in yellow. The loops are coloured in red. The key conserved residues are highlighted in red square box

Chitinase11, belonging to class II of GH family 19, possesses only the CatD and lacks the CBD. The CatD of chitinase11 is almost the same as that of the class I chitinases and is dominated by α-helices. Two disulfide bridges (Cys48/Cys109 and Cys214/Cys247) maintain the overall stability of the bilobed structure. The loops (I, II, III, IV and V) that are common in class I chitinases are also located in the CatD of chitinase11. As evidenced by the structural alignment of chitinase11 with its template 2BAA (Figs. 9 and 10), the key elements of secondary structure are strongly conserved (secondary structure element identity of 96.9 %). The key catalytic residues of the template (2BAA) superimpose with their counterparts in chitinase11 and are fully conserved: Glu67 with Glu91, Glu89 with Glu113, and likewise Tyr123/Tyr133, Asn124/Asn134, Gln118/Gln128, Gln162/Gln172, Lys165/Lys175, Pro163/Pro173 and Asn199/Asn209. The structure of chitinase8 is the same as that of chitinase11, the only difference being the number of α-helices (12 in chitinase11 and 13 in chitinase8).

Fig. 9

Structural superposition of chitinase11 with its template 2BAA along with their secondary structure alignment [* Sec A: Chitinase11, Sec B: 2BAA]. CatD of chitinase11 is coloured in green and CatD of template is coloured in yellow. The loops are coloured in red. The key conserved residues are highlighted in red square boxes

Fig. 10

Structural superposition of chitinase5 with its template. (a) Superposition of CBD with 2UVO along with their secondary structure alignment [* Sec A: Chitinase5, Sec B: 2UVO]. CBD of chitinase5 is coloured in green and CBD of template is coloured in yellow. (b) Superposition of CatD with 3HBD along with their secondary structure alignment [* Sec A: Chitinase5 Sec B: 3HBD]. CatD of chitinase5 is coloured in green and CatD of template is coloured in yellow. The loops are coloured in red. The key conserved residues are highlighted in red square box

Chitinase5, belonging to the class IV chitinases, is somewhat different from the class I and II chitinases, although it is also dominated by α-helices. It has fewer disulfide bridges than the class I and II chitinases. The two domains of chitinase5 (CBD and CatD) were modelled independently, each with its closest neighbouring structure as template. The CBD of chitinase5 was built by homology modelling using the closest homologous structure, wheat germ agglutinin in complex with N-acetyl-D-glucosamine (PDB ID: 2UVO, chain B), as template. The CBD of this chitinase is globular and very small, and has an irregular β-sheet with two strands. It is connected to the catalytic module by a flexible linker. The aromatic residues concentrated on the surface of the protein are highly conserved.

The CatD of chitinase5 was modelled using the CatD of Norway spruce (Picea abies) chitinase (PDB ID: 3HBD) as template. The CatD of chitinase5 is composed mainly of α-helices. It has loops I and III, as in class I and II, but lacks loops II, IV and V; this is one of the most striking features of this class of enzyme. The structural superposition of the CatD of chitinase5 with 3HBD showed that the secondary structure elements superpose well (secondary structure element identity of 93.0 %) even though the sequence identity is only 59.0 %. Comparison with the template also revealed that the general acid Glu113 of 3HBD is fully conserved as Glu151 in the chitinase5 model, and that Glu218/Glu255 and Arg230/Arg267 are likewise strongly conserved. Ubhayasekera et al. reported that Glu113, Arg230 and Glu218 form a triad in the CatD of 3HBD. The formation of a triad by Glu151, Arg267 and Glu255 in chitinase5 suggests that the function of the protein might be the same as that of 3HBD.
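
The loop inventory described for the different classes can be summarised as sets, which makes the reasoning about shared loops explicit. The sketch below encodes only what is stated in the text, with one assumption: bacterial chitinases are taken to retain loops III and IV, since only loops I, II and V are reported missing.

```python
ALL_LOOPS = {"I", "II", "III", "IV", "V"}

# Loop content of the CatD as described in the text.
loops = {
    "class I (rice)": set(ALL_LOOPS),
    "class II (rice)": set(ALL_LOOPS),
    "class IV (rice)": ALL_LOOPS - {"II", "IV", "V"},
    # Assumption: the loops not reported missing are taken as present.
    "bacterial": ALL_LOOPS - {"I", "II", "V"},
}

# Loops shared by every group, and by the rice classes alone.
shared_all = set.intersection(*loops.values())
shared_rice = set.intersection(*(v for k, v in loops.items() if "rice" in k))

print("shared by all groups:", sorted(shared_all))
print("shared by rice classes:", sorted(shared_rice))
```

Under that assumption, loop III is the only loop retained in every group, whereas the three rice classes share loops I and III.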

The 3d2GO server was used to predict gene ontology (GO) terms for the chitinase proteins. The results showed that the modelled chitinase proteins are predominantly associated with the GO terms cell wall macromolecule catabolic process, chitin catabolic process, chitinase activity, hydrolase activity hydrolysing O-glycosyl compounds, and hydrolase activity acting on glycosyl bonds, with a confidence value >0.85, whereas the model of chitinase12 showed GO terms associated with polysaccharide binding, pattern binding and carbohydrate binding with a lower confidence value (<0.25). The multiplicity of function in the different classes of chitinase predicted by the 3d2GO server is consistent with the structural variation among the forms of chitinase in rice.

Structural pockets and cavities are often associated with the binding sites and active sites of proteins, respectively [56]. The active sites and the amino acids involved in the formation of the large cavity, with their surface areas and volumes, are listed in Table 8. Most of the residues that form the large cavities are hydrophilic and charged, and they are conserved in all forms of chitinase in rice. The study of the active sites reveals that the catalytic domain possesses few aromatic amino acids. Therefore, interactions in the CatD are dominated by hydrogen bonding rather than by other interaction types.
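
The composition of a set of cavity residues can be tallied by physicochemical class, as in the following minimal sketch. The grouping of amino acids is a simplification assumed for illustration (tyrosine, for example, is counted only as aromatic), and the input list is the set of conserved 2BAA residues named earlier in this section, used as a stand-in. It is not a pocket listing from the CASTp output.

```python
from collections import Counter

# Simplified residue classes (an assumed grouping, for illustration only).
AROMATIC = {"Phe", "Tyr", "Trp"}
CHARGED = {"Asp", "Glu", "Lys", "Arg", "His"}
POLAR = {"Ser", "Thr", "Asn", "Gln", "Cys"}


def classify(residue: str) -> str:
    """Classify a label such as 'Glu67' by its three-letter residue name."""
    name = residue[:3]
    if name in AROMATIC:
        return "aromatic"
    if name in CHARGED:
        return "charged"
    if name in POLAR:
        return "polar"
    return "other"


# Stand-in input: conserved residues of the 2BAA template named in the text.
residues = ["Glu67", "Glu89", "Tyr123", "Asn124", "Gln118",
            "Gln162", "Lys165", "Pro163", "Asn199"]

counts = Counter(classify(r) for r in residues)
for label in ("charged", "polar", "aromatic", "other"):
    print(f"{label:9s} {counts[label]}")
```

For this stand-in list, seven of the nine residues are charged or polar and only one is aromatic.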

Table 8 The active sites and the amino acids involved in the formation of the large cavity in chitinases, with their surface areas and volumes predicted by the CASTp server

Conclusions

Chitinases are of prime interest to plant pathologists and can be used in a variety of ways to improve plant health. These enzymes are involved not only in plant resistance to external environmental factors, by generating signal molecules, but also in plant growth and development. They are classified into various types on the basis of their structural and functional properties. The structure and function study gives a brief outline of the 3-D structure of the enzymes and shows how the secondary structure elements are arranged in the protein. Rice chitinases are single- or multi-domain, multifunctional enzymes. The evolutionary analysis revealed how the enzyme evolved and how it is related to other plant chitinases. The structural similarities and differences among three classes of chitinases (I, II and IV) from rice are reported in this work. The different classes of chitinases belonging to GH family 19 possess highly α-helical, bilobed structures. The superposition of all the classes of chitinases with their closest homologous templates shows that the secondary structure elements are strongly conserved. The highly conserved residues are catalytic in nature, or help in substrate binding or in disulfide bridge formation. One of the most striking features of the three classes of rice chitinase is that the CatD possesses a catalytic triad, common to all forms, which is thought to be involved in catalysis. As far as the loops within the CatD are concerned, class I and II chitinases of rice possess loops I, II, III, IV and V, whereas loops II, IV and V are missing in the class IV chitinase. Loop III, which is common to all classes of chitinases, might play an important role in their respective functions.
Our study also supports the view that the presence or absence of different loops in GH family 19 of rice may be responsible for products of various sizes, as previously reported by Mizuno et al. and Fukamizo et al. [75, 76]. It can therefore be concluded that the sequence variation among the different forms of chitinase might lead to structural variation, which is reflected in multiple functions; this also fits the function prediction made by the 3d2GO server. The study of the active sites reveals that the CatD is dominated by hydrogen bonding, as only a few aromatic amino acids lie in the active sites for interaction. Further structural studies of this enzyme from different plants may enhance knowledge of its catalytic mechanism and substrate binding.