Introduction

Proteins carry out nearly all biological functions, including enzymatic, biochemical and immunological reactions, co-ordination of nerve impulses, transport and storage. Protein structure is organized at primary, secondary, tertiary and quaternary levels, which are maintained by bonds and interactions among the amino acids. Proteins are encoded by genes, which determine their amino acid sequence, and the folding of a protein depends upon that sequence. Proteins carry out biological functions under both in vivo and in vitro conditions. Protein structure is therefore key to understanding the role of a protein in various biological functions.

The sequential assembly of amino acids into the primary structure allows them to interact and form secondary structure, and the chain then folds into a stable three-dimensional (3D) conformation that minimizes the energy of the protein. This 3D structure provides the stability that a protein needs to be functionally active. Protein stability is determined by forces acting together: hydrogen bonding, hydrophobic interactions, electrostatic (positive and negative charge) interactions and conformational entropy. Correct orientation, arising from proper interactions between specific amino acids in an appropriate sequence, produces a stable conformation and a functionally active protein.

Protein functions such as ligand or substrate binding, catalysis and post-translational modification are affected by the addition or deletion of nucleotide(s). However, the extent to which such changes affect the structure and function of a protein varies. According to Tokuriki and co-workers [1], mutations can be classified into two types: (1) new-function mutations and (2) other mutations. New-function mutations endow the mutated protein with new activities or selectivity, while other mutations produce a mutant protein without any functional change.

Amino acid mutations in a protein may occur either in the core or on the surface, at buried or exposed positions. Mutations at the surface of a domain are usually considered neutral but may affect binding affinities, while those in the core may alter the stability of the domain fold. Mutations at the interfaces of protein complexes, called hot spots, are associated with diseases [2]. Mutations of active-site residues, or of residues at distant sites, affect the structure of the enzyme and of its substrate and cofactor binding sites. Non-synonymous Single Nucleotide Polymorphisms (nsSNPs) and mutations can have a variety of adverse effects on proteins, such as changes in genotype and phenotype that may cause diseases like cancer. In contrast to these negative effects [1, 3, 4], many mutations exert a positive effect on the stability, function and structure of proteins. Some mutations are silent with respect to protein function but increase protein stability. Changes in protein stability are associated with single or multiple amino acid substitution(s). Detailed knowledge of the effects of mutation can be employed for constructing proteins of interest by protein engineering techniques and in disease treatment and management. Mutations that confer stability can be used for identifying a wide spectrum of drug resistance mechanisms. The description above illustrates that mutations and SNPs produce a variety of effects that may be advantageous or disadvantageous for the living organism. Therefore, it is of utmost importance to study the relationship between protein function and stability in order to understand the evolutionary dynamics of proteins. Besides this, insight into nsSNPs and mutations can help in early diagnosis, prognosis, prevention and treatment of diseases. It is also helpful in protein engineering, for designing novel proteins with enhanced stability and activity in the laboratory.

Mutation studies not only inform the design of novel proteins with enhanced stability and activity, but also provide a platform for designing mutants with a clear understanding of how a particular amino acid at a specific location affects stability. Stability in this context refers to developing thermo-tolerant and pH-stable proteins. Mutations can thus be targeted according to need, for example to design enzymes that tolerate high temperature or that are stable in alkaline or acidic environments. All this can be achieved by identifying the most mutation-tolerant residues in the protein, so that the overall conformation and 3D structure of the designed protein are not affected.

This review discusses the effect of mutation on protein stability, both for maintaining functionality and for enhancing thermal stability and pH tolerance. Protein stability is defined here as maintenance of the active conformation, in which the protein is functional and performs its specific role as enzyme, substrate or receptor. The computational approaches and the basis of their predictions are also discussed.

Stability of protein: a major concern

Stability of protein is a major concern because only stable proteins can perform their function efficiently. The stability of a protein or enzyme depends on the organization of residues at the active site, which mainly involves two types of residues: (1) functional residues and (2) key catalytic residues. Generally, functional residues are polar or charged and embedded in a hydrophobic cleft, whereas key catalytic residues possess unfavorable backbone angles [1].

Many researchers have reported a significant loss of protein stability when mutations are associated with gain of function [1]. Mutation at key positions of a protein leads to the evolution of new function at the expense of stability. Conversely, mutations may sometimes yield a protein that is more stable than the native one but has lost function [5]. Since most mutations that confer new or altered protein function are destabilizing, these mutations must be studied. Studying the impact of mutation on protein stability reveals details of the mechanism of protein folding and identifies the role of specific residues in function and stability [6, 7]. This information can be used for designing new proteins with desired characteristics, such as specific levels of enzymatic activity and stability, by introducing point mutations using site-directed mutagenesis and random mutagenesis. Introducing point mutations by these methods requires financial investment, time, resources and labor. A second important aspect of studying the stability induced by point mutation is identifying, among unstable proteins, new stable mutant proteins that exhibit a wide spectrum of drug resistance [8, 9].

Analysis of stability

Protein stability is maintained by various non-covalent interactions such as hydrophobic, electrostatic, van der Waals and hydrogen-bonding interactions [10–12]. These interactions are of great value for the analysis and prediction of protein stability. Several methods have been developed for assessing and predicting the factors that affect protein stability upon mutation or SNPs. These methods also provide a basis for discriminating mutated stable proteins from native unstable proteins and disease causing mutations from non-disease causing mutations, and for developing novel enzymes with improved function and stability.

Several methods have been developed over the years for predicting the factors that influence the stability of mutant proteins, even upon single amino acid substitution [13–20]. Xencor developed techniques for high throughput generation of myriad sequence variants, coupled with computational protein design automation, to improve the stability of cytokine and growth factor protein therapeutics and, later, antibodies [2]. Traditional methods have several limitations, which are overcome by computational approaches based on sequence, structure and energy features. These methods provide better predictions of stability when used in combination than when used alone.

Stability predicting features of computational tools

Computational tools work on algorithms (sets of rules) that are based on the following predictive features:

(a) Structural features: Hydrophobic area, packing and folding of protein, backbone angles, electrostatic interactions are some of the important features that can be used for stability prediction [5].

(b) Sequence features: This is based on conserved sequences and amino acid position. The impact on protein viability can be assessed. However, it provides no direct insight into the underlying mechanism [5].

(c) Combination of structural and sequence features: In this approach, all the above mentioned features can be used together to predict stability.

(d) Energy features: Energy features are important for assessing the stability of protein. The energy of unfolding of the target protein is the sum of various energies such as Van der Waals interaction, solvation energy, extra stabilizing free energy, etc. [5].

(e) Molecular features: Solvent accessible surface area of the interface and hydrophobic and hydrophilic area is used for stability prediction.

Machine learning approaches learn structure from data. All these features have been used to develop machine learning approaches such as support vector machines (SVM), neural networks and decision trees, with the incorporation of functional effects. These methods are designed to predict the change caused by a single amino acid substitution using secondary structure, surface accessibility and sequence attributes. Their objective is to identify and use the non-redundant features that are required for accurate classification [21].
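How a single substitution might be turned into inputs for such a classifier can be sketched as follows. The feature set, the function name and the use of the Kyte-Doolittle hydropathy scale are illustrative assumptions, not the inputs of any specific tool named in this review.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Three-state secondary structure: helix, strand, coil.
SECONDARY_STRUCTURE_STATES = ("H", "E", "C")


def encode_substitution(wild, mutant, secondary_structure, relative_accessibility):
    """Encode one substitution as a numeric feature vector (hypothetical layout).

    Layout: [hydropathy change, relative solvent accessibility,
             one-hot secondary structure (H, E, C)].
    """
    if secondary_structure not in SECONDARY_STRUCTURE_STATES:
        raise ValueError("secondary_structure must be one of H, E, C")
    hydropathy_change = KYTE_DOOLITTLE[mutant] - KYTE_DOOLITTLE[wild]
    one_hot = [1.0 if secondary_structure == state else 0.0
               for state in SECONDARY_STRUCTURE_STATES]
    return [hydropathy_change, relative_accessibility, *one_hot]


# Hypothetical buried helical leucine replaced by aspartate.
features = encode_substitution("L", "D", "H", 0.10)
print(features)
```

A vector of this kind would be one row of the training matrix passed to an SVM, neural network or decision tree; real tools use considerably richer feature sets.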

Steps for prediction of mutant protein stability by computational approaches

There are three main steps for predicting mutant protein stability:

(i) Development of a database for proteins and mutants: Different stability prediction tools require different types of databases. These databases provide a template whose structure and details are known. This template is used for comparing the structure and stability of the query sequence. These databases are given in Table 1.

Table 1 Different databases provide information related to mutant protein

(ii) Understanding the factors influencing protein mutant stability: Comparison of various structural and sequence features provides a better understanding of the factors affecting the stability of a mutant protein. Detailed insight into these factors can help to solve problems related to protein stability.

(iii) Prediction of protein stability upon mutation: This can be done with the help of different computational approaches and tools. Many tools, available as web based or standalone tools, are mentioned in Table 2.

Table 2 This table depicts different tools with their salient features and respective websites for accession

The information gathered by these tools can be used for stability prediction and, further, for designing new proteins and studying diseases caused by non-synonymous mutations.

Computational approaches

In the last decade, many tools have been developed to predict the effect of Single Nucleotide Polymorphisms (SNPs) and mutations at the genomic level (coding, non-coding and regulatory sequences) and on the translated protein (synonymous and non-synonymous SNPs), problems that were earlier considered unsolved. SNPs are of three types, i.e., neutral, fully disruptive and partially disruptive. Prediction of neutral and fully disruptive SNPs is relatively easy. However, prediction of SNPs that produce intermediate phenotypic effects is a great challenge. To overcome this problem, computational tools such as PolyPhen, HOPE, SNPeffect and many others have been developed [41, 42]. Not all SNPs and mutations are associated with the origin of disease, and it is of utmost importance to discriminate disease associated mutations from non-disease mutations. This can also be done by computational tools such as PoPMuSiC-2.0, Site Directed Mutator, Mutation Assessor, PhD-SNP and PANTHER. All the computational tools can be grouped into four broad categories according to whether they use (1) structural features, (2) sequence features, (3) energy parameters or (4) combined features. In this review, we have also focused on the use of consensus tools for predicting stability.

Structure based approaches/tools

These tools predict stability changes from structural properties such as the secondary structure and accessible surface area of the mutated residue [20]. The PolyPhen tool was developed to assess the intermediate phenotypic effects of point mutations. Similar to HOPE, it predicts protein structure by statistical analysis, but it provides no information on the free energy change upon point mutation and therefore cannot be used for correct prediction of stability. This limitation is addressed by the SNPeffect tool, a structure based tool that uses the FoldX force field for quantitative estimation of free energy and thus gives accurate information about protein stability. However, it is limited by the quality of the available structure: a protein can be modelled only when a template structure with more than 90 % sequence identity is available. This tool also helps in determining the protein homeostasis landscape, i.e. the amount of protein that must be present in the various compartments of the cell [54]. Further modification has led to the latest version, SNPeffect 4.0. It is an integrated meta-analysis tool that combines FoldX for predicting protein misfolding, TANGO and WALTZ for protein aggregation, and LIMBO for chaperone interaction. It enables large scale data mining and graphical representation of data, and provides detailed information on functional sites, structural features and post-translational modification of proteins [55]. It can also be applied for molecular characterization and presentation of disease-linked polymorphisms in humans, owing to its database of phenotypes of human single nucleotide polymorphisms (SNPs).

To analyze the effects of nsSNPs on phenotypic characteristics, a web based tool called nsSNP Analyzer was developed. It provides detailed information on the effect of an SNP on protein structure, surface accessibility and environment, together with a multiple sequence alignment. This tool facilitates the discrimination of disease-associated nsSNPs from neutral nsSNPs. It uses a machine learning method called Random Forest for prediction and requires structural and evolutionary information for a query nsSNP [33].

CUPSAT (Cologne University Protein Stability Analysis Tool) was developed to predict changes in stability of protein upon point mutation with good efficiency [8]. It provides information about the site of mutation and structural attributes such as solvent accessibility, secondary structure and torsion angles affected by mutation.

The stability is predicted by calculating difference in free energy of unfolding between wild type and mutated protein. As a rule this tool requires protein structure in Protein Data Bank (PDB) format and the location of the residue to be mutated.
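The free energy comparison described above can be illustrated with a minimal sketch. The function names, the sign convention (ΔΔG = ΔG of unfolding of the mutant minus that of the wild type, so that negative values indicate destabilization) and the numerical values are assumptions chosen for illustration; they are not the implementation or output of CUPSAT.

```python
def ddg_unfolding(dg_wild_type, dg_mutant):
    """Change in unfolding free energy (kcal/mol) caused by a mutation.

    Assumed convention: ddG = dG_unfold(mutant) - dG_unfold(wild type).
    """
    return dg_mutant - dg_wild_type


def classify_stability(ddg, tolerance=0.0):
    """Label a mutation from the sign of ddG.

    `tolerance` is a hypothetical dead band around zero within which
    the mutation is reported as neutral.
    """
    if ddg > tolerance:
        return "stabilizing"
    if ddg < -tolerance:
        return "destabilizing"
    return "neutral"


# Hypothetical unfolding free energies, not measured data.
ddg = ddg_unfolding(dg_wild_type=8.0, dg_mutant=6.5)
print(ddg, classify_stability(ddg))
```

Different tools report the difference with opposite signs, so the convention should be checked before predictions from several tools are compared.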

The Auto-Mute tool is another computational approach. Like SDM, it uses a knowledge based potential and a machine learning approach to predict the function and stability of mutant proteins according to the score generated. However, SDM has been shown to perform better than Auto-Mute. Like CUPSAT, this tool requires data in PDB format [44].

Structure-based tools have a drawback: they cannot be used if a crystallized, three-dimensional, high resolution structure of the protein is unavailable. All of these methods need the structure in PDB format. However, mutations may occur at the genome (instead of proteome) level, which can change the structure and other properties of the mutated protein. This limitation can be resolved by using sequence based prediction tools. Nowadays, the huge amount of data generated by genome sequencing projects is used for building libraries or databases, which are then applied for comparing a query sequence with a target sequence.

Sequence based approaches/tools

Sequences that code for functional regions of a protein are considered conserved regions and are not useful for predicting the structure of a mutated protein. Therefore, sequence based methods rely on the evolutionary conservation of homologous protein sequences [56].

The simplest and easiest tool to handle is HOPE (Homotopy optimization method), because its results are presented as animated videos and figures, which make it easy for both trained and new users to discriminate between neutral and deleterious mutations. It is used for predicting the intermediate phenotypic effects caused by a single amino acid change. It is a sequence based tool that facilitates calculation of the potential energy associated with a protein model. Any user of the online web server can submit a sequence and a mutation. It also collects information about the protein from various sources (3D protein structure, UniProt database and DAS servers). It requires a known protein with a minimum energy conformation, which acts as a template and is used to predict the structure of the query sequence on a statistical basis. Comparison of the query sequence with the template provides information about the effect of the mutation on protein structure, function and stability [41]. Further, it builds a model of the mutated protein if the identity between the query sequence and the PDB file exceeds a threshold value.

Another tool for predicting the effect of SNPs or mutations that cause a single amino acid substitution is SIFT (Sorting Intolerant from Tolerant), developed by Ng and Henikoff [57]. It is based solely on sequence homology and discriminates neutral from deleterious SNPs using a normalized probability score (score < 0.05: deleterious; score > 0.05: neutral) [57]. It collects sequences using PSI-BLAST and considers only closely related sequences, judged by conservation in the conserved regions. This tool is very sensitive and has been applied to human variant databases to find deleterious (disease causing) mutations among neutral SNPs [58]. Like PolyPhen, it is used to predict loss of function mutations. It is also applicable to mutations introduced by in silico procedures. Its most important advantage is that it can evaluate a large number of substitutions, and thus a larger number of affected phenotypes. Moreover, applying this tool during in silico analysis can reduce the number of assays required for screening desired mutants. Its main limitation is that it gives a generalized probability score rather than one specific to the mutant protein of interest. It has very low specificity and therefore its results must be interpreted carefully [59]. This tool does not provide the molecular mechanism or details needed to understand the cause of disease [2].
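The score threshold quoted above translates into a simple decision rule, sketched below. The function name and the example substitutions are hypothetical, and assigning a score of exactly 0.05 to the neutral class is an assumption, since the text specifies only the strict inequalities.

```python
SIFT_CUTOFF = 0.05  # threshold quoted in the text


def classify_sift_score(score, cutoff=SIFT_CUTOFF):
    """Map a normalized probability score to a label.

    Scores below the cutoff are called deleterious; all others,
    including the boundary value (an assumption), are called neutral.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("a normalized probability must lie in [0, 1]")
    return "deleterious" if score < cutoff else "neutral"


# Hypothetical substitutions and scores, for illustration only.
example_scores = {"A10V": 0.42, "G25R": 0.01}
for substitution, score in example_scores.items():
    print(substitution, score, classify_sift_score(score))
```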

PhD-SNP (Predictor of human Deleterious Single Nucleotide Polymorphism) is also a sequence based approach for predicting the change in protein stability caused by a single nucleotide polymorphism. It is a Support Vector Machine (SVM) based classifier, specifically designed for human datasets associated with disease causing mutations. It takes its input data from the Swiss-Prot database and generates protein sequences and profile information about the mutant protein [35]. The PhD-SNP server provides datasets for Parepro, another SVM based computational tool. Parepro requires evolutionary information instead of structural information to predict the effect of nsSNPs with higher accuracy than any other method [60].

Another sequence based tool is MUpro that can accurately predict the stability of protein using primary sequence information only by machine learning approaches. Enhanced accuracy of this tool is reported if structure is also provided, although it is not necessary [61].

Mutation Assessor also calculates a score that predicts the impact of a mutation on a protein. It relies on evolutionary conservation and uses multiple sequence alignment (MSA) to generate a conservation score, which classifies the mutation into four categories: neutral, low, medium or high. All these mutations affect the functionality of proteins and their stability. This tool has also been used for SNPs associated with diseases such as cancer. It has its own database for the prediction of diseases; however, it can also retrieve data from other databases such as COSMIC, UniProt and Pfam. It is available only as a web based tool.
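A conservation score of the kind derived from a multiple sequence alignment can be sketched as follows. The sketch uses the normalized Shannon entropy of an alignment column, a common stand-in that is not claimed to be the scoring scheme of Mutation Assessor; the function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_conservation(alignment, column):
    """Return 1 - normalized Shannon entropy for one alignment column.

    1.0 means the column is fully conserved; 0.0 would mean all 20
    amino acids are equally frequent. Gap characters ('-') are ignored.
    """
    residues = [seq[column] for seq in alignment if seq[column] != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)


# Toy alignment of equal-length sequences, not real data.
toy_msa = ["MKVLA", "MKILA", "MRVLG", "MKVLS"]
for index in range(len(toy_msa[0])):
    print(index, round(column_conservation(toy_msa, index), 3))
```

A substitution at a highly conserved column would be ranked as more likely to have a functional impact than one at a variable column.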

Like Mutation Assessor, PANTHER (Protein Analysis Through Evolutionary Relationships) is based on a position-specific evolutionary conservation score that relates a mutation to disease and to its impact on protein function and stability. The score, which ranges from 0 (neutral) to 10 (deleterious), determines the type of mutation. It applies hidden Markov models (HMMs) for aligning protein sequences and gives an idea of the variants in protein families and subfamilies. This tool can also be used for designing novel stable proteins.

I-Mutant, the predecessor of I-Mutant 2.0, implements a neural network algorithm for predicting the stability change caused by a point mutation. Its successor is a support vector machine based tool that can work from sequence input alone and can therefore be used even if the crystal structure of the protein is not known [16].

A web based tool PoPMuSiC (Prediction of Protein Mutant Stability Changes) can also be used for predicting all mutations irrespective of their effect on stability. It is a neural network based approach, which predicts the mutant protein stability on the basis of sequence. Compared to the other methods, it is a very fast technique which requires only few minutes of time for predicting stability changes of any protein possessing single-site mutations. It is user friendly and easy to exploit [62, 63] for computer-aided designing of mutant proteins.

Similar to PoPMuSiC, SNAP (Screening for Non-Acceptable Polymorphisms) is a neural-network based tool that collects information solely from sequence to study the effect of mutation on protein structure and function. In addition to evolutionary information, it can predict secondary structure, solvent accessibility and other information related to protein structure. This enables the discrimination of gain of function mutations from loss of function mutations, and the tool can be used in protein design. It reports the effect of each substitution as a score with a reliability index, with improved accuracy, and it can incorporate mutations at a position of the user's choice [64]. However, the most advanced version of PoPMuSiC, i.e., PoPMuSiC 2.1, can be used to find interesting sites for introducing mutations and is therefore well suited to protein design. Unlike the previous version, PoPMuSiC 2.1 can predict only those regions or sequences that correspond to structural and functional weakness [65]. This property is based on the use of evolutionary information. It requires only sequence input, uses machine learning approaches for processing data, and predicts the direction and value of the stability change.

All the above-mentioned tools can predict only the effect of single amino acid substitutions in a sequence. Only one computational tool, PROVEAN (PROtein Variation Effect ANalyser), predicts the effect not only of single or multiple amino acid substitutions but also of single or multiple insertions and deletions [49]. This tool applies a new metric, an alignment-based score, to predict the change caused by mutation. First, it collects all possible supporting sequences corresponding to the input protein sequence using the BLAST search tool. Then, delta alignment scores are computed for each sequence variation and are used to calculate the PROVEAN score [50]. It provides more accurate results for mutant proteins than Mutation Assessor, SIFT and PolyPhen-2 [49]. Castellana and Mazza [66] used this tool for the classification of nsSNPs, with SNP-associated chromosomal positions as input.
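The idea of a delta alignment score can be illustrated with a deliberately simplified sketch: the variant and the original query are each scored against a set of supporting sequences, and the differences are averaged. The match/mismatch scoring, the requirement that sequences are already aligned to equal length, the plain average and all names below are assumptions; the scoring details of the published tool are not reproduced here.

```python
def alignment_score(a, b, match=1, mismatch=-1):
    """Score two pre-aligned, equal-length sequences (toy scoring scheme)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(match if x == y else mismatch for x, y in zip(a, b))


def apply_substitution(sequence, position, new_residue):
    """Return `sequence` with the residue at 1-based `position` replaced."""
    return sequence[: position - 1] + new_residue + sequence[position:]


def mean_delta_score(query, variant, supporting):
    """Average over supporting sequences of score(variant) - score(query)."""
    if not supporting:
        raise ValueError("at least one supporting sequence is required")
    deltas = [alignment_score(variant, s) - alignment_score(query, s)
              for s in supporting]
    return sum(deltas) / len(deltas)


# Hypothetical query and supporting sequences, not database hits.
query = "MKVLAT"
supporting = ["MKVLAT", "MKILAT", "MKVLGT"]
variant = apply_substitution(query, 2, "E")  # K2E
print(variant, mean_delta_score(query, variant, supporting))
```

A strongly negative value indicates that the variant aligns worse than the query to its homologs, which is the signal such a score uses to flag a variation as likely deleterious.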

Annotation of SNPs is difficult in species lacking a reference genome. This problem is addressed by a newly developed tool known as SNPMeta, which collects information about SNPs by comparing sequences with those in the GenBank databases. The results obtained from this tool are comparable with those obtained from a reference genome [67].

Energy based approaches/tools

Different methods have been developed and implemented to predict the stability of a mutant protein with respect to the wild type protein. Some computational tools use energy functions to compute the free energy changes related to protein stability. Physical and statistical energy based approaches provide good qualitative results but do not provide precise values and cannot be applied to large datasets. In contrast, empirical potential approaches rapidly provide precise values for the contribution of an amino acid substitution to the stability of a protein [68].

Approaches based on binding free energy cannot be applied to predict core mutations, owing to the biophysical nature and the polar and electrostatic attractions of protein–protein interfaces. Therefore, only the dissociation rate of the protein, rather than the association rate, is used to predict the difference in energy between mutant and native protein. This was done by the alanine scanning method, which is based on empirical energy functions. The main advantage of this method is its applicability to predicting the effect, including the molecular effect, of multiple amino acid substitutions. It provides more accurate predictions than sequence based tools.

PoPMuSiC is a sequence based tool. However, PoPMuSiC 2.0, a later version of this tool, works on an energy based function and measures the change in the protein upon single nucleotide substitution. It is a statistical potential based method that has also been used to characterize in silico the effect of mutation on protein stability. Besides this, it also predicts mutations responsible for hereditary diseases, acquired drug resistance and the natural heterogeneity of a viral protein [65]. This tool can estimate the stability changes in a medium size protein within seconds, as well as the robustness of the structure. It shows the highest linear correlation (0.63) between predicted and measured stability values.
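The linear correlation used to benchmark such tools is the Pearson coefficient between predicted and experimentally measured stability changes. The sketch below computes it with the standard formula; the function name and the example values are hypothetical and do not reproduce any benchmark from the literature.

```python
import math


def pearson_r(predicted, measured):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    covariance = sum((p - mean_p) * (m - mean_m)
                     for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_p * var_m)


# Hypothetical predicted and measured stability changes (kcal/mol).
predicted_ddg = [-1.2, 0.3, -2.5, -0.4, 0.8]
measured_ddg = [-0.9, 0.1, -3.1, -1.0, 0.2]
print(round(pearson_r(predicted_ddg, measured_ddg), 2))
```

A value of 1 would indicate a perfect linear relationship, so a coefficient such as the 0.63 quoted above reflects a moderate agreement between prediction and experiment.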

After the PoPMuSiC algorithm, Site Directed Mutator (SDM) has the highest correlation between predicted and measured stability values [65]. It performs better than the Auto-Mute tool. SDM is also a knowledge based approach that predicts mutant protein stability in the context of disease development and protein engineering. It was developed by Topham and colleagues in 1997 [69]. It uses a statistical potential energy function to predict a stability score (negative for destabilizing mutations and positive for stabilizing mutations). This score is useful not only in disease prediction but also in protein engineering. The main problem associated with this tool is its bias in predicting stabilizing and destabilizing mutations. The performance of SDM therefore improves only when highly stabilizing and highly destabilizing mutations are considered.

Combined features based tools

In the last decade, many in silico approaches have been developed to predict mutant protein stability on the basis of structure, sequence and energy based features. As discussed earlier, structure based approaches can be applied only when the structure is known; otherwise, sequence based approaches are suggested. However, the prediction accuracy of sequence based approaches is lower than that of structure based approaches [38]. No single approach has been proven to be accurate and to provide a complete analysis of the mutant protein. Prediction accuracy can be increased by using the right combination of features [70]. Therefore, combined and consensus approaches have been developed with increased accuracy and efficiency, for use in all situations [52]. Mutations in the core are difficult to predict with a single method; therefore, methods based on sequence, structural and energetic features can be suitably combined by machine learning for prediction.

The I-Mutant tool (mentioned earlier) has been modified and developed into its latest version, I-Mutant 2.0, which can be used with either sequence or structure. It is based on SVM and support vector regression [71] for automatic prediction of stability changes upon single nucleotide substitution. The accuracy of prediction depends upon the sequence and structural information provided to it. Besides predicting the value and direction of the stability change, it can be used for predicting point mutation associated diseases in humans. It can also be useful in protein engineering. Unlike the other tools, it requires data input in raw format, which is a unique feature of the tool [71].

Another tool based on sequence, structural and phylogenetic features is PolyPhen-2. It is an automatic tool and an advanced version of PolyPhen, used to predict the impact of amino acid substitution on the structure and function of human proteins by machine learning approaches. It requires data from a human protein database (UniProtKB/Swiss-Prot) and a known 3D structure, or homologous proteins if a known structure is unavailable, to predict amino acid replacement in the core of the protein corresponding to the known structure. Similar to SIFT, it can also accept FASTA protein sequences. It is a Bayesian classifier used to categorize pathogenicity [66]. It also provides information on the functional impact of SNPs by using input from the F-SNP database (mentioned in Table 1). The best feature of this tool is its built-in support for high performance computing, which makes it suitable for handling the huge amount of data generated by next generation sequencing projects.

The EvoD tool [72] has better accuracy than PolyPhen-2 and Condel. In this tool, multiple sequences submitted as NCBI RefSeq protein IDs are used to predict the sites affected by nucleotide substitution on the basis of biochemical and evolutionary properties. This tool distinguishes neutral mutations from deleterious mutations.

An integrated tool, I-Stable, has also been developed on the basis of both structural and sequence information. It is an SVM based tool that not only provides information regarding protein stability, but also provides accurate predictions of secondary structure and relative solvent accessibility, and classifies proteins into superfamilies. It can also be used for the design and engineering of proteins.

Recently, Berliner and colleagues [2] developed a stability meta-predictor for predicting the effects of core and domain–domain interface mutations by integrating sequence and structural features. This novel tool is known as ELASPIC (Ensemble Learning Approach for Stability Prediction of Interface and Core mutations). To increase the accuracy of prediction, it uses the Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm, which combines both sequence and structural features. This tool not only predicts the stability and affinity of the mutant protein, but also reveals the molecular principles behind disease-causing mutations. It also discriminates disease-causing mutations from neutral mutations.

Consensus tools

Consensus tools integrate several tools that compare the sequences of homologous proteins by multiple sequence alignment. This comparison yields a consensus sequence, which is then compared with existing protein sequences to identify the differences introduced by point mutations or nsSNPs. This allows the selection of the mutations most likely to increase protein stability. In this way, consensus tools provide the information needed to design a novel protein with the desired characteristics and stability.
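The consensus-sequence step described above can be illustrated with a minimal Python sketch. It assumes the homologous sequences are already aligned to equal length and takes the most frequent residue in each column. The function names and the toy alignment are hypothetical and do not reproduce the procedure of any specific tool.

```python
from collections import Counter

def consensus_sequence(aligned, gap="-"):
    """Return the most frequent residue in each alignment column."""
    if not aligned:
        raise ValueError("alignment is empty")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    consensus = []
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != gap)
        # A column made only of gaps stays a gap; ties go to the residue seen first.
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)

def differences_from_consensus(query, consensus, gap="-"):
    """List (position, query_residue, consensus_residue); positions are 1-based."""
    return [
        (i, q, c)
        for i, (q, c) in enumerate(zip(query, consensus), start=1)
        if q != c and gap not in (q, c)
    ]

# Toy alignment, made up for illustration only.
alignment = ["MKTAYIAK", "MKTAYLAK", "MRTAYIAK", "MKTGYIAK"]
cons = consensus_sequence(alignment)
print(cons)
print(differences_from_consensus("MRTAYLAK", cons))
```

Positions where the query departs from the consensus are candidate sites for back-to-consensus substitutions. Whether such a substitution actually increases stability still has to be checked with a stability predictor or by experiment.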

PON-P is a machine learning-based method that requires sequences to be submitted in multi-FASTA format. It collects input data from the SIFT, PhD-SNP, PolyPhen-2 and SNAP tools for analysis. It therefore delivers the results predicted by all of these programs together, so that their outputs can be compared easily [66].

Condel is also a consensus tool, developed by integrating SIFT, PolyPhen-2 and MutationAssessor. It collects, assembles and presents the results obtained by these tools. It can be used as a web server or as a standalone tool (run locally after downloading) [47].

Another consensus tool, PredictSNP, collects input data from several tools, including MutPred, PolyPhen-1, PolyPhen-2, SNAP, MAPP, PhD-SNP, SIFT and SNPs&GO, to predict the effect of a single amino acid substitution. Most of these tools are based on machine learning methods designed to classify mutations as neutral or deleterious on the basis of physicochemical, sequence and structural parameters. PredictSNP provides a consensus prediction with improved accuracy and efficiency over the individual integrated tools [73].
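The idea of combining calls from several predictors can be sketched as a weighted majority vote. This is only an illustration: the two-class labels, the optional per-tool weights and the tie-breaking rule are assumptions made for the example, not the published scoring scheme of PON-P, Condel or PredictSNP.

```python
def consensus_call(predictions, weights=None):
    """Combine per-tool calls ('deleterious' or 'neutral') by weighted majority vote.

    predictions: dict of tool name -> call; tools with no result are left out.
    weights: optional dict of tool name -> positive weight (hypothetical;
             every tool counts 1.0 when no weight is given).
    Returns (call, support), where support is the winning share of the total weight.
    """
    if not predictions:
        raise ValueError("no predictions to combine")
    weights = weights or {}
    totals = {"deleterious": 0.0, "neutral": 0.0}
    for tool, call in predictions.items():
        if call not in totals:
            raise ValueError(f"unknown call {call!r} from {tool}")
        totals[call] += weights.get(tool, 1.0)
    # Ties are reported as 'deleterious' so an ambiguous variant is not dismissed.
    call = "deleterious" if totals["deleterious"] >= totals["neutral"] else "neutral"
    return call, totals[call] / sum(totals.values())

# Example calls, invented for illustration.
calls = {
    "SIFT": "deleterious",
    "PolyPhen-2": "deleterious",
    "SNAP": "neutral",
    "PhD-SNP": "deleterious",
}
print(consensus_call(calls))
```

Reporting the support value alongside the call keeps disagreement between tools visible, which is useful when deciding which variants deserve closer inspection.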

DUET is an integrated tool which uses sequence-based approaches. It integrates mCSM and SDM into a consensus prediction and combines their results using an SVM. It predicts the stability of mutant proteins, which is useful for protein engineering and for anticipating disease. It predicts the structure by multiple sequence alignment, which can be done automatically by the tool or manually by the user [74]. The availability of sequences may affect the prediction accuracy of the tool. It can be applied to the prediction of nsSNP effects in human and non-human genomes. In addition, it helps in engineering novel proteins with enhanced stability [52].

By combining several predictors, these consensus tools improve accuracy and efficiency and generally give better predictions than the individual tools they integrate. They also provide a common platform for comparing the results obtained by different tools, which reduces the effort required to run each tool and study its output separately.

Tools for study of buried residues

The aforementioned tools can be used to study changes in the protein, but they mainly provide information about the interactions of exposed residues. Amino acid changes occurring in the buried residues of the core are difficult to predict. Recently, the NeEMO tool was developed to study stability changes on the basis of residue interaction networks (RINs). RINs are comprehensive data structures that help manage heterogeneous data, such as evolutionary and topological data, and are used to describe the interactions of a mutant amino acid with its structural environment. NeEMO is reported to be effective and to give accurate predictions for buried residues, which are difficult to predict by other methods. Unlike other methods, it does not require modelling of the mutant protein structure, which both avoids introducing modelling errors and makes it computationally economical. The NeEMO web server can be freely accessed at http://protein.bio.unipd.it/neemo/. It requires PDB files of the PoPMuSiC 2.0 dataset as input. It can be used for protein engineering and for the study of diseases caused by unstable mutant proteins [75].
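To make the notion of a residue interaction network concrete, the following minimal sketch builds one from C-alpha coordinates with a simple distance cutoff. The 8.0 Å cutoff, the function names and the toy coordinates are assumptions made for illustration. NeEMO's actual network construction and features are not reproduced here.

```python
import math
from itertools import combinations

def build_rin(ca_coords, cutoff=8.0):
    """Build a residue interaction network as an adjacency dict.

    ca_coords: dict of residue id -> (x, y, z) C-alpha coordinates in angstroms.
    cutoff: contact distance in angstroms (8.0 is an illustrative assumption).
    """
    network = {res: set() for res in ca_coords}
    for a, b in combinations(ca_coords, 2):
        if math.dist(ca_coords[a], ca_coords[b]) <= cutoff:
            network[a].add(b)
            network[b].add(a)
    return network

def degree(network, residue):
    """Number of contacts a residue makes in the network."""
    return len(network[residue])

# Toy coordinates, made up for illustration (not taken from a real PDB entry).
coords = {
    "A10": (0.0, 0.0, 0.0),
    "L11": (3.8, 0.0, 0.0),
    "V12": (3.8, 3.8, 0.0),
    "G50": (0.0, 3.8, 0.0),
    "K90": (20.0, 20.0, 20.0),
}
rin = build_rin(coords)
print(sorted(rin["A10"]), degree(rin, "K90"))
```

In a network of this kind, the neighbours of the mutated position describe its structural environment directly from the wild-type structure, which is why no model of the mutant structure needs to be built.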

Conclusion

Protein stability is central to the study of diseases caused by unstable proteins and to the engineering of proteins with desired characteristics. Stability is an important parameter to consider when the effect of mutations is examined, and its analysis draws on sequence, structure and evolutionary relationships. In the last decade, various tools have been developed for the prediction of stability. These tools provide details on nsSNP-induced effects on protein structure and their molecular basis. However, each tool has its own limitations, which restrict its predictive ability. These restrictions stem from the parameters available to each tool, so a tool should be used only after understanding which prediction parameters matter for the specific type of enhancement sought. Recently, some consensus tools have been developed by integrating various tools. They provide more reliable and accurate predictions by compiling and comparing data from several tools. NeEMO, a novel tool, has also been developed for predicting the effect of un-annotated (buried) SNPs or mutations on stability. These tools also have limitations which restrict their use to particular datasets. Research is ongoing to develop a complete tool fit for all types of data. In future, these tools are expected to provide more reliable and accurate assessments as the submission of new data to the databases rapidly increases. Until then, to improve the reliability of the results obtained from these tools, one can additionally use molecular dynamics simulation to assess stability with greater confidence. This not only provides insight into stability, the folding process and the interactions among residues, but also reveals subtle changes caused by mutations. Tools such as MDAnalysis, Desmond (Schrödinger), X-PLOR, MDTraj, WORDOM and MDWeb are available to carry out molecular dynamics analysis. Mutation analysis combined with molecular dynamics simulation studies supports the design and analysis of naturally occurring as well as newly designed mutants with good stability.
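One common way to follow structural change along a simulation is the root-mean-square deviation (RMSD) of each frame from a reference structure. The sketch below is a simplified, dependency-free illustration: it assumes the frames are already superimposed on the reference and represents each frame as a plain list of coordinates. It does not use the API of any of the packages named above, which handle trajectory reading and superposition themselves.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) points.

    No superposition is performed, so the frames are assumed to be pre-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and the same length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def rmsd_profile(reference, frames):
    """RMSD of every frame against the reference structure."""
    return [rmsd(reference, frame) for frame in frames]

# Two-atom toy "trajectory", invented for illustration.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = [reference, [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]]
print(rmsd_profile(reference, frames))
```

Comparing the RMSD profiles of wild-type and mutant simulations run under the same conditions gives a first indication of whether a mutation perturbs the fold.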