1 Introduction

The term in-silico means “computer-aided”. It was coined in 1989 by analogy with the Latin terms in-vivo, in-vitro, and in-situ. In-silico drug design therefore refers to rational design in which drugs are designed or discovered by computational methods (Singh et al. 2017). Serendipity played a vital role in the discovery of novel drugs in the past, but the present-day trend has moved from discovering drugs to designing them. In-silico drug design approaches leverage an understanding of the biochemistry of diseases, of pathways, and of disease-causing proteins to design compounds capable of modifying the properties of those proteins (Raut et al. 2015). In-silico drug design strategies include (a) computer-based systems that make drug discovery and development more productive; (b) creation of chemical and biological databases of ligands and targets/proteins to identify new drugs; and (c) development of in-silico methods that predict pharmacokinetic or drug-likeness properties of compounds before screening, to facilitate early identification of compounds that are likely to fail in clinical phases (Barlow et al. 2009; Ferreira et al. 2015; Klabunde and Hessler 2002; Kuntz 1992). Cheminformatics and bioinformatics are the two key disciplines of in-silico drug design; both influence modern drug discovery practice and accelerate drug design. Bioinformatic techniques can help in drug target identification, validation of drug targets, protein modeling, and understanding of drug targets, including their evolution and phylogeny. 
Cheminformatic techniques can be used to store and maintain information on chemicals and their properties, to identify novel bioactive compounds, to optimize leads, to predict ADME (absorption, distribution, metabolism, and excretion) properties in silico, and to address other concerns, all of which help to reduce late-stage failure of compounds (Bleicher et al. 2003; Li 2001; Singh et al. 2017).

2 Molecular Simulations

Current approaches to in-silico drug design fall into two major groups (Jorgensen 2004; Meng et al. 2011): ligand-based and receptor-based (structure-based) approaches. Ligand-based methods comprise QSAR (quantitative structure-activity relationship) analysis, pharmacophore assignment/mapping, and database searching or mining. Structure-based methods, which comprise molecular docking and modern molecular simulations (for example, classical molecular dynamics simulations and QM/MM molecular dynamics simulations), require structural information about drug targets, which is available from nuclear magnetic resonance (NMR), X-ray crystallography, or homology-based protein model building (Jaworska et al. 2005; Senn and Thiel 2009; Verma et al. 2010). One can obtain drug target(s) of interest through (1) bioinformatics mining of databases and repositories (for example, the Protein Data Bank and TDR) or (2) comparative homology modeling using software (for example, MODELLER) or online resources (for example, Swiss-Model and ModBase) (Borhani and Shaw 2012; Durrant and McCammon 2011; Eswar et al. 2006; Kitchen et al. 2004). Molecular simulation for drug design requires both a compound library from which potential drugs will be discovered and the drug target(s) of interest. There are different flexible approaches to designing compound libraries for drug design/discovery (Fig. 28.1).

Fig. 28.1 Schematic representation of structure-based drug design method

3 Design of Libraries

A library of compounds from which a potential drug will be discovered can be designed in several ways: by mining the literature and generating the structures with molecular graphics software (such as ChemOffice or ChemDraw); by generating electronic structures of phytochemicals from structural elucidation studies; by structurally modifying compounds of interest with molecular graphics software; or by collecting compounds from public domains, databases, and repositories (for example, the ZINC database) and filtering them according to Lipinski’s rule of five to obtain a lead library, or according to the rule of three (RO3) proposed by Astex to obtain a fragment library. A lead molecule is commonly characterized as a small molecule that has a molecular weight (MW) of no more than about 500 Da, can bind its target via hydrogen bonds with no more than five hydrogen bond donors and ten hydrogen bond acceptors, has enough rotatable bonds to permit binding to the target, and is adequately lipophilic, with a partition coefficient (cLogP, a measure of hydrophobicity) of not more than five (Buntrock 2002). General fragment libraries, designed for screening against a wide range of targets, are diverse collections of compounds with high heterogeneity in pharmacophores or in physicochemical properties such as molecular mass and lipophilicity (Dixon et al. 2006; Guner 2000). The molecules are inspected for functional groups that might contribute unwanted chemical reactivity, toxicity, or false positives. A lead library fulfills the “rule of five” proposed by Lipinski, which brings an understanding of disposition properties (absorption, distribution, metabolism, and excretion, ADME) into the search for effective inhibitors (Lipinski 2004; Zhang and Wilkinson 2007). 
By analogy with Lipinski’s “rule of five”, molecules in libraries intended for fragment-based screening follow these rules: (1) molecular weight ≤300 Da; (2) lipophilicity, cLogP ≤3; (3) number of hydrogen bond donors ≤3 and of hydrogen bond acceptors ≤3; (4) number of rotatable bonds ≤3; and (5) to a minor extent, molecular polar surface area ≤60 Å². Once the library is created, molecular docking simulations can be performed on it after the docking protocol has been validated (Singh and Bast 2014, 2015b, c).
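The two filters described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Descriptors` container, the function names, and the strict "no violations" policy are assumptions made here (common practice often tolerates one rule-of-five violation), and the descriptor values are assumed to have been computed beforehand by a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float   # molecular weight in Da
    clogp: float        # calculated octanol-water partition coefficient
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors
    rot_bonds: int      # number of rotatable bonds
    psa: float          # polar surface area in square angstroms


def passes_rule_of_five(d: Descriptors) -> bool:
    """Lead filter: MW <= 500, cLogP <= 5, donors <= 5, acceptors <= 10.

    Assumption: every criterion must hold (no violation is tolerated).
    """
    return (d.mol_weight <= 500 and d.clogp <= 5
            and d.h_donors <= 5 and d.h_acceptors <= 10)


def passes_rule_of_three(d: Descriptors, check_psa: bool = True) -> bool:
    """Fragment filter using the thresholds listed in the text."""
    ok = (d.mol_weight <= 300 and d.clogp <= 3
          and d.h_donors <= 3 and d.h_acceptors <= 3
          and d.rot_bonds <= 3)
    if check_psa:
        # The text treats the polar surface area limit as a minor criterion,
        # so it can be switched off.
        ok = ok and d.psa <= 60
    return ok


def split_library(library: dict) -> tuple:
    """Return (lead_names, fragment_names) for a name -> Descriptors mapping."""
    leads = [name for name, d in library.items() if passes_rule_of_five(d)]
    fragments = [name for name, d in library.items() if passes_rule_of_three(d)]
    return leads, fragments
```

Because every rule-of-three threshold is at or below the corresponding rule-of-five threshold, any compound that passes the fragment filter also passes the lead filter in this sketch.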

4 Molecular Docking Simulation

Molecular docking simulation is an automated computational procedure that predicts how ligands (molecules in the library of interest) bind to their binding site in the drug target (Alonso et al. 2006). This includes determining the orientation of the ligand, its conformational geometry, and a score. The score can be a free energy, a binding energy, or a qualitative numerical measure. Each docking algorithm places the ligand in diverse orientations and conformations at the binding site and calculates a score for each one. The two key components of molecular docking are accurate pose prediction and accurate binding free energy estimation, which is used to rank the docking poses. Promising candidates from the docking analysis can then be subjected to molecular dynamics simulations (Alonso et al. 2006; Thomsen and Christensen 2006). Molecular docking methodologies are of great importance in the planning and design of new drugs. Since the development of the first algorithms in the 1980s, molecular docking has become a vital tool for drug discovery (Medina-Franco et al. 2011). It is applied at many stages of drug discovery, for example to predict the docked structure of the ligand-receptor complex and to rank ligands by their score. Docking procedures help to explain the energetically favorable binding pose of ligands with their receptor (Iman et al. 2015). However, they are less accurate than molecular dynamics simulations, which are more computationally expensive but predict receptor-ligand interactions more accurately. Molecular docking programs include AutoDock Vina, DOCK, AutoDock, HADDOCK, FlexX, GOLD, and GLIDE, among others (Singh and Bast 2014, 2015a, c; Singh et al. 2016; Alonso et al. 2006). Molecular docking simulations are applicable not only to the study of target-ligand interactions (i.e. target-ligand docking) but also to the understanding of protein-protein interactions (protein-protein docking).
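The "place the ligand in many poses, score each one, rank the poses" loop described above can be illustrated with a deliberately simplified toy. The sketch below is not how any of the listed programs works: it assumes a rigid ligand, rotations about a single axis, uniform random sampling, and a generic 12-6 Lennard-Jones-like score with made-up parameters (`r_min`, `eps`). All function names are hypothetical.

```python
import math
import random


def lj_score(ligand, receptor, r_min=4.0, eps=1.0):
    """Sum a 12-6 Lennard-Jones-like term over all ligand-receptor atom pairs.

    Lower is better; each pair contributes -eps at the distance r_min.
    """
    total = 0.0
    for lx, ly, lz in ligand:
        for rx, ry, rz in receptor:
            r = math.sqrt((lx - rx) ** 2 + (ly - ry) ** 2 + (lz - rz) ** 2)
            r = max(r, 0.5)  # clamp to avoid huge values for overlapping atoms
            q = (r_min / r) ** 6
            total += eps * (q * q - 2.0 * q)
    return total


def place(ligand, angle, shift):
    """Rotate the ligand about the z axis through its centroid, then translate."""
    n = len(ligand)
    cx = sum(p[0] for p in ligand) / n
    cy = sum(p[1] for p in ligand) / n
    cz = sum(p[2] for p in ligand) / n
    c, s = math.cos(angle), math.sin(angle)
    posed = []
    for x, y, z in ligand:
        dx, dy, dz = x - cx, y - cy, z - cz
        posed.append((cx + c * dx - s * dy + shift[0],
                      cy + s * dx + c * dy + shift[1],
                      cz + dz + shift[2]))
    return posed


def dock(ligand, receptor, n_poses=500, box=6.0, seed=0):
    """Sample random rigid poses; return (score, pose) pairs, best score first."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_poses):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        shift = tuple(rng.uniform(-box, box) for _ in range(3))
        pose = place(ligand, angle, shift)
        results.append((lj_score(pose, receptor), pose))
    results.sort(key=lambda item: item[0])
    return results
```

Real docking programs replace each piece with something far more elaborate (ligand torsional flexibility, guided search instead of uniform sampling, calibrated scoring functions), but the sample, score, and rank structure is the same.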

5 Molecular Dynamics Simulation

Molecular dynamics (MD) simulations follow the evolution of the coordinates of complex molecular systems as a function of time. MD has become one of the main methods in the toolbox for creating novel bioactive molecules; it can help to rationalize their mechanism of action and to improve chemical structures with respect to biological effect. Its foremost benefit is the explicit treatment of structural flexibility and entropic effects. This provides a more accurate estimate of the thermodynamics and kinetics of drug-target recognition and binding, and improved algorithms and hardware architectures continue to broaden its application. Classical MD simulations now permit structure-based drug design approaches that fully account for the structural flexibility of the overall drug-target system (Durrant and McCammon 2011; Harvey and De Fabritiis 2012). Indeed, it is now widely accepted that two main drug-binding models (induced fit and conformational selection) have superseded Emil Fischer’s rigid lock-and-key binding paradigm (Boehr et al. 2009; Changeux and Edelstein 2011; Vogt and Di Cera 2012). Researchers have recently demonstrated the power of these approaches for investigating protein-ligand binding and determining the associated free energies and kinetics. Receptor and ligand flexibility is essential for accurately predicting drug binding and detailed kinetic and thermodynamic properties. Consequently, classical and/or QM/MM MD simulations are no longer considered prohibitive for drug design. Instead, they are advancing the boundaries of computationally accelerated drug design in both industry and academia (Borhani and Shaw 2012; Mortier et al. 2015). MD simulation programs include GROMACS, CHARMM, Amber, NAMD, CPMD, and CP2K. Their documentation can be found on their websites and other web resources (Phillips et al. 2005; Van Der Spoel et al. 2005).
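To make "the evolution of coordinates as a function of time" concrete, the sketch below integrates Newton's equations for a single particle in one dimension with the velocity Verlet scheme, a time-stepping algorithm widely used in MD codes. It is an illustration only: the harmonic potential, the reduced units, and the function names are assumptions chosen here for brevity, and no claim is made about how any particular package listed above implements its integrator.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle in 1D; return the list of (x, v) after each step."""
    trajectory = []
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Force for the toy potential U(x) = 0.5 * k * x**2 (assumed for illustration)."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Example: start at x = 1 with zero velocity and follow the motion for 1000 steps.
trajectory = velocity_verlet(1.0, 0.0, harmonic_force, mass=1.0, dt=0.01, n_steps=1000)
```

In a production simulation the same loop runs over all atoms in three dimensions, with forces derived from a molecular mechanics force field and with thermostats or barostats added as required; the total energy of the toy system stays close to its initial value, which is a common sanity check on the time step.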

6 Advantages of Molecular Dynamics Simulation

MD simulations are normally conducted at ambient temperature, so comparatively low energy barriers, for example 0.6 kcal/mol, can be crossed easily. Therefore, if the initial drug-receptor configuration resulting from binding is separated from the most stable configuration by such a low barrier, molecular dynamics can cross the barrier. Molecular simulations can thus identify more stable, and hence more realistic, conformational states of ligand-receptor complexes (Mortier et al. 2015). Moreover, they can give unique insight into the conformational changes of the receptor caused by ligand binding, shedding light on the detailed mechanisms of receptor inhibition or activation, which presently cannot be investigated by other methods. Finally, molecular simulations usually include solvent and therefore allow solvent effects to be taken into account. Current research shows the significance of MD simulation for examining the biomolecular flexibility associated with ligand recognition (Nair et al. 2011, 2012; Nair and Miners 2014). Accounting for the flexibility of the target receptor allows better drug design than the basic lock-and-key concept of a static receptor.
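Why a barrier of about 0.6 kcal is "low" can be checked with a short calculation. The sketch below assumes that the barrier is meant per mole and that "normal temperature" is roughly 300 K; under those assumptions the thermal energy RT is itself about 0.6 kcal/mol, so the Boltzmann factor for the barrier is of order one third, whereas a barrier of several kcal/mol is penalized by orders of magnitude. The function names are hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def thermal_energy(temperature_k):
    """Return RT in kcal/mol."""
    return R_KCAL * temperature_k


def boltzmann_factor(barrier_kcal, temperature_k=300.0):
    """Return exp(-E/RT) for an energy E (kcal/mol) above the reference state.

    Assumption: 300 K stands in for the 'normal temperature' of the text.
    """
    return math.exp(-barrier_kcal / thermal_energy(temperature_k))


rt_300 = thermal_energy(300.0)        # about 0.596 kcal/mol
low_barrier = boltzmann_factor(0.6)   # about 0.37
high_barrier = boltzmann_factor(5.0)  # below 1e-3
```

The Boltzmann factor is only a rough guide to how often a barrier is crossed (actual rates also depend on a prefactor), but it shows why barriers comparable to RT do not trap an MD trajectory.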

7 Combined Docking and MD Simulations

Fast and inexpensive docking protocols can be coupled with accurate but more time-consuming MD procedures to predict more reliable receptor-ligand complexes. The power of this combination lies in the complementary strengths and weaknesses of the two methods. Docking procedures explore the large conformational space of ligands in a very short time, permitting the screening of huge libraries of drug-like compounds at a reasonable cost. Their key disadvantages are the absent or limited flexibility of the protein, which is not allowed to adjust its conformation upon ligand binding, and the lack of a unique and widely applicable scoring function, which is required to produce a reliable ranking of the final complexes. MD simulations, in contrast, can treat both ligand and protein flexibly, permitting an induced fit of the receptor binding site around the newly introduced ligand. In addition, the effect of explicit water molecules can be studied directly, and very accurate binding free energies can be obtained. However, the main problems with MD simulations are that they are time-consuming and that the system can get trapped in local minima (Khandelwal et al. 2005). The combination of the two procedures in a protocol in which docking is applied for the fast screening of large libraries and MD simulations are then used to explore conformations of the target, optimize the structures of the final complexes, and estimate accurate energies is therefore a reasonable strategy for improving the drug-design process (Alonso et al. 2006; Khandelwal et al. 2005). Another approach is to perform a short MD simulation of the starting target to obtain diverse conformations of the target, dock the library against representatives of these conformations, and then subject the energetically favorable complex(es) to MD simulations. 
Ligand-dependent approaches. Lacking structural data for the target, ligand-based techniques make use of the information provided by known inhibitors of the target receptor. Structures similar to the known inhibitors are identified in chemical databases by a range of approaches; widely applied ones include similarity and substructure searching, pharmacophore matching, and 3D shape matching.

8 Similarity and Substructure Searching

QSAR (quantitative structure-activity relationship) is a statistical approach that attempts to relate the physical and chemical properties of molecules to their biological activities. The goal of QSAR is to predict molecular properties from structure without the need to perform in-vitro or in-vivo experiments, which saves time and resources. Descriptors such as the number of rotatable bonds, molecular weight, and LogP are commonly used. Numerous QSAR methods are in use, depending on the dimensionality of the data, ranging from 1D-QSAR to 6D-QSAR. These approaches rest on the assumption that the activity of a chemical compound is related to its structure (Damale et al. 2014; Lill 2007). More precisely, the method states that the activity, or a property such as toxicity, is connected to the chemical structure through a definite mathematical algorithm or rule. The activity is assumed to depend on whether a particular feature is present in or absent from the structure. For example, it is widely accepted that if a chemical compound contains certain groups, such as an aromatic amine or an epoxide, then the compound is likely to be genotoxic.
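The idea of a "definite mathematical rule" linking a descriptor to an activity can be illustrated with the simplest possible QSAR model, a one-descriptor linear regression. The sketch below is only an illustration: the training values are synthetic numbers invented here (they are not experimental data), the choice of LogP as the single descriptor is an assumption, and real QSAR studies use many descriptors together with proper validation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on the given data."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic, made-up training set (NOT experimental data): LogP vs. activity.
logp = [1.0, 2.0, 3.0, 4.0]
activity = [5.1, 5.9, 7.1, 7.9]

slope, intercept = fit_line(logp, activity)
# Predicted activity of a hypothetical untested compound with LogP = 2.5.
predicted = slope * 2.5 + intercept
fit_quality = r_squared(logp, activity, slope, intercept)
```

Once fitted, the rule is applied to compounds that have not been tested, which is where the saving in time and resources comes from; the prediction is only as trustworthy as the similarity between the new compound and the training set.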

9 Pharmacophore Mapping

This is the method for deriving a 3D pharmacophore. A pharmacophore is a set of features, together with their relative spatial arrangement, that is considered capable of interacting with a specific biological target, for example positively and negatively charged groups, hydrogen bond donors and acceptors, hydrophobic regions, and aromatic rings. A pharmacophore map identifies the bioactive conformation of each active molecule and indicates how to superimpose, and compare in 3D, the different active compounds. The map identifies which types of points match and in which conformation. It depends on element types and chemical connectivity (Debnath 2002). A pharmacophore map produced by superposing active compounds identifies their common features. Based on the pharmacophore map, either de novo design or 3D database searching can be carried out. Structure-activity relationships can serve as an indirect probe of the 3D structure and chemical characteristics of the macromolecular recognition site for ligands. The purpose of pharmacophore mapping is to convert such structure-activity data into a 3D model of the requirements for binding to the drug target. This model is then used to search 3D databases for molecules that match the 3D map and thereby to find novel active molecules (Liu et al. 2010; Marchand-Geneste et al. 2002).

10 Free Energy Calculations

Energy calculations from molecular docking have limited accuracy. More precise energy predictions can be achieved with various molecular simulation strategies. Methods of free energy calculation can be divided into (a) approximate free energy methods and (b) relative and absolute binding free energy methods.

10.1 Approximate Free Energy Methods

10.1.1 Linear-Response Approximation Methods

This method is based on sampling the end states: the target-ligand complex, the unliganded target, and the free ligand. The simplest method of binding free energy estimation is the linear-response approximation, in which the electrostatic free energy change is predicted from the electrostatic interaction energy between the ligand and its surrounding environment (drug target or solvent) (Tao et al. 1996). Linear-response approximation methods can be classified into (a) the linear interaction energy (LIE) method, (b) the semi-macroscopic protein-dipoles Langevin-dipoles method, and (c) molecular mechanics with Poisson-Boltzmann (or generalized Born) and surface area solvation (MM-PBSA/MM-GBSA). The central idea of the LIE technique is that only converged averages of the interaction energies between the ligand and its environment need to be computed to obtain an estimate of the binding free energy (de Amorim et al. 2008). The master LIE equation is based on the electrostatic and van der Waals interaction energies of the ligand, which are calculated from molecular simulations (molecular dynamics (MD) or Monte Carlo (MC) simulations). Implementing the LIE method requires two MD/MC simulations: one of the free ligand in solution and one of the solvated ligand bound to the drug target. The most comprehensive endpoint methods currently are molecular mechanics with Poisson-Boltzmann (or generalized Born) and surface area solvation (MM-PBSA/MM-GBSA). The MM-PBSA/MM-GBSA method uses a continuum solvent model that replaces explicit water by treating it as a uniform medium (implicit solvent model). In this way, the average solvation properties of water are captured without averaging over the interactions of thousands of real water molecules, which could cause notable fluctuations in solute-solvent and solvent-solvent energies (Carlsson and Åqvist 2006). 
Ligand-solvent interaction energies can be calculated accurately by solving the Poisson-Boltzmann equation, or approximately by using generalized Born theory. Because the generalized Born model is less computationally intensive, it is more popular for MD simulations (Chen et al. 2008). Many improvements have been made to generalized Born models, and they can now reach the same level of accuracy as Poisson-Boltzmann models. These methods have been used in many settings, such as protein design, protein-protein interactions, conformer stability, and re-scoring (Brandsdal et al. 2003; Chen et al. 2008).
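For reference, the master equation of the LIE method mentioned earlier in this section is commonly written in the following form. The notation is chosen here and is not taken from the cited works: the angle brackets denote simulation averages of the ligand-surroundings (l-s) interaction energies, "bound" refers to the simulation of the solvated complex, and "free" to the simulation of the ligand alone in solution.

$$ \Delta G_{\mathrm{bind}} = \alpha \left( \langle V^{\mathrm{vdW}}_{l\text{-}s} \rangle_{\mathrm{bound}} - \langle V^{\mathrm{vdW}}_{l\text{-}s} \rangle_{\mathrm{free}} \right) + \beta \left( \langle V^{\mathrm{el}}_{l\text{-}s} \rangle_{\mathrm{bound}} - \langle V^{\mathrm{el}}_{l\text{-}s} \rangle_{\mathrm{free}} \right) + \gamma $$

Here β is the electrostatic scaling factor, which equals 1/2 in the pure linear-response limit, while α and the optional constant γ are empirical parameters that are typically fitted to experimental binding data; their values depend on the parameterization and are not specified here.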

In the MM-PBSA approach, the free energy of a state, that is, P (free protein), L (free ligand), or PL (complex), is calculated from the following sum:

$$ G = E_{\mathrm{bond}} + E_{\mathrm{el}} + E_{\mathrm{vdW}} + G_{\mathrm{pol}} + G_{\mathrm{npol}} - TS, $$

where E_bond, E_el, and E_vdW are the standard molecular mechanics energy terms for bonded, electrostatic, and van der Waals interactions, G_pol and G_npol are the polar and nonpolar contributions to the solvation free energy, T is the absolute temperature, and S is an entropy estimate. The polar term is calculated by solving the Poisson-Boltzmann equation (MM-PBSA) or with the generalized Born model (MM-GBSA), while the nonpolar term is obtained from a linear relation to the solvent-accessible surface area (Hou et al. 2010). The procedures involve several severe approximations, for instance a questionable entropy estimate, missing conformational contributions, and neglected effects of binding-site water molecules. Moreover, the approaches often overestimate differences between sets of ligands. However, because MM-PBSA and MM-GBSA invest extra effort in sampling and entropies, they are closer to a true free energy calculation than docking (Mobley and Dill 2009). The average free energies of the unbound ligand (G_L), the unbound protein (G_P), and the complex (G_PL) can be estimated from separate MD or MC simulations of each; this approach is called three-average MM-PBSA (3A-MM-PBSA). However, it is much more common to simulate only the complex (PL) and to create the ensemble averages for the unbound receptor and ligand simply by deleting the appropriate atoms; this approach is called one-average MM-PBSA (1A-MM-PBSA). Typically, the simulations used to estimate the energy terms employ explicit solvent, but because the solvation terms are evaluated with implicit solvent models (GBSA/PBSA), all solvent molecules are subsequently deleted from each trajectory snapshot. It has also been suggested that MM-PBSA calculations could be based on minimized structures only, instead of a large number of MD/MC trajectory snapshots. In practice, minimized structures often give results comparable to those obtained from MD/MC simulations. However, the results of such calculations depend strongly on the initial structure and ignore dynamic effects.
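In the one-average scheme, the binding free energy is obtained by averaging G_PL − G_P − G_L over the snapshots of a single complex trajectory. The sketch below shows only this final bookkeeping step and assumes that the per-snapshot free energies of the three states have already been computed by an MM-PBSA or MM-GBSA tool; the function name and the use of the standard error of the mean as a precision estimate are choices made here for illustration.

```python
from statistics import mean, stdev


def binding_free_energy(g_complex, g_protein, g_ligand):
    """Return (mean, standard error) of G_PL - G_P - G_L over paired snapshots.

    Each argument is a list with one free energy per snapshot (for example in
    kcal/mol), all taken from the same complex trajectory.
    """
    if not (len(g_complex) == len(g_protein) == len(g_ligand)):
        raise ValueError("each state needs exactly one value per snapshot")
    per_snapshot = [pl - p - l
                    for pl, p, l in zip(g_complex, g_protein, g_ligand)]
    avg = mean(per_snapshot)
    # The standard error needs at least two snapshots; report 0.0 otherwise.
    if len(per_snapshot) > 1:
        sem = stdev(per_snapshot) / len(per_snapshot) ** 0.5
    else:
        sem = 0.0
    return avg, sem
```

Taking the difference snapshot by snapshot, rather than subtracting three independent averages, is what lets much of the noise cancel in the one-average approach, because the receptor and ligand geometries are identical in all three terms.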

Software implementation of the MM-PBSA and MM-GBSA methods. The MM-PBSA method was originally developed for the AMBER software and is currently freely available in AmberTools. During the past decade, automated scripts have also been created for the popular free simulation packages Desmond, NAMD, and GROMACS, as well as for the APBS software (Genheden and Ryde 2015).

Genheden and Ryde showed that LIE is two to seven times more efficient than MM-PBSA, because of the time-consuming entropy estimate in the latter (Genheden and Ryde 2011). At the same time, MM-PBSA was shown to have better overall performance than MM-GBSA (Homeyer et al. 2014). Poor precision is one of the main problems of the MM-PBSA and MM-GBSA methods and sometimes makes them useless for comparing ligands with similar affinities. The precision problem is usually addressed by computing only interaction energies, analyzing as many MD snapshots as possible, and running multiple simulations (Genheden and Ryde 2015). In general, the accuracy of endpoint methods (correlation coefficients with experiment of r² = 0.0–0.9, depending on the protein) is usually better than that of molecular docking but worse than that of alchemical perturbation methods (Genheden and Ryde 2015). In the expert opinion of Genheden and Ryde, endpoint methods (particularly MM-PBSA) may be suitable for improving the results of docking and virtual screening or for rationalizing observed affinities and trends. Nevertheless, they are not accurate enough for the later stages of drug design.

10.2 Relative and Absolute Binding Free Energy Methods

Alchemical free energy approaches can be used to calculate either absolute binding affinities (of a particular ligand to a receptor) or relative binding affinities (the difference between two or more related ligands). In lead optimization efforts, where optimization by small, sequential chemical modifications is of main interest, accurate relative free energies can determine which modifications enhance affinity and selectivity (Chodera and Pande 2011). Thermodynamic integration (TDI) and free energy perturbation (FEP) are generally reliable but are also more time-consuming than endpoint or docking approaches (Homeyer et al. 2014). These techniques are usually called alchemical methods because, instead of simulating the binding/unbinding process, which would require a simulation many times longer than the lifetime of the complex, the ligand is alchemically transmuted into either another chemical species or a noninteracting molecule via intermediate, possibly nonphysical, stages (Chodera et al. 2011).

Alchemical methods

  (a) Thermodynamic integration (TDI)

  (b) Free energy perturbation (FEP)

TDI has the advantage that the accuracy of binding free energy predictions can be improved by subsequently including additional intermediate states (Michel and Essex 2010), so this strategy offers the possibility of beginning calculations at a reduced level of accuracy and adding sampling only where required. This is important because of the trade-off between accuracy and required computing time, which makes it necessary to find an optimal balance between the quality of the estimate and the computational demand (Homeyer et al. 2014). TDI transformations of one ligand into another are generally carried out via simulations at n discrete steps. The free energy change ΔG for the transformation is calculated by integrating, over these steps, the ensemble-averaged derivative of the potential energy with respect to the coupling parameter that connects the two states. To determine the difference in binding free energy ΔΔG between two ligands, transformations are carried out for both the complex-bound ligands and the solvated ligands. ΔΔG is computed as the difference between the individual free energies: ΔΔG = ΔG_bound − ΔG_solvated (Homeyer et al. 2014).
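The two numerical steps of a TDI calculation, integrating over the discrete steps and then combining the bound and solvated legs, can be sketched as follows. This is a minimal illustration under stated assumptions: the coupling parameter is called lambda and runs from 0 to 1, the ensemble averages of dU/dlambda at each step are assumed to come from separate simulations that are not shown, and the trapezoidal rule is one simple choice of quadrature among several. The function names are hypothetical.

```python
def integrate_ti(lambdas, dudl_means):
    """Trapezoidal estimate of dG = integral over lambda of <dU/dlambda>.

    lambdas    : increasing coupling-parameter values, e.g. 0.0 ... 1.0
    dudl_means : ensemble average of dU/dlambda at each lambda value
    """
    if len(lambdas) != len(dudl_means) or len(lambdas) < 2:
        raise ValueError("need at least two matching (lambda, <dU/dlambda>) points")
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        total += 0.5 * width * (dudl_means[i] + dudl_means[i + 1])
    return total


def relative_binding_free_energy(dg_bound, dg_solvated):
    """ddG = dG_bound - dG_solvated for the same ligand-to-ligand transformation."""
    return dg_bound - dg_solvated
```

Adding an intermediate state simply adds one more (lambda, average) point to the lists, which is how the accuracy of the integral can be refined only in regions where the integrand changes rapidly.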

FEP simulations, rooted in statistical mechanics, offer a way to include effects that are otherwise missing from the calculations, e.g., conformational sampling, explicit solvent, and shifts of protonation states upon binding; nonetheless, they usually require extensive computational resources and expertise. The FEP methodology has been known for more than 20 years, but its impact on drug discovery is only now being recognized (Acevedo et al. 2012). The main obstacle to implementing FEP as a routine procedure in computer-aided drug design (CADD) is obtaining reliable ΔG estimates for complex biomolecular systems within a reasonable computational time. The FEP approach uses the classic Zwanzig expression to describe the free energy change by constructing a nonphysical path connecting the desired initial and final states of a system. For relative free energies of binding, single or double topology perturbations can be set up to change one ligand into another.
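The Zwanzig expression referred to above is the exponential average ΔG(0→1) = −kT ln⟨exp(−(U1 − U0)/kT)⟩, where the average is taken over configurations sampled in state 0. The sketch below evaluates it for a single perturbation step; it assumes that the energy differences U1 − U0 (in kcal/mol) have already been collected from a simulation of state 0, and it shifts the exponent by the smallest value purely for numerical stability. The function name and default temperature are choices made here.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def fep_zwanzig(delta_u, temperature_k=300.0):
    """Free energy change 0 -> 1 from energy differences sampled in state 0.

    delta_u : list of U1 - U0 values (kcal/mol), one per sampled configuration
    """
    kt = K_B * temperature_k
    shift = min(delta_u)  # factor out the largest exponential to avoid overflow
    avg = sum(math.exp(-(du - shift) / kt) for du in delta_u) / len(delta_u)
    return shift - kt * math.log(avg)
```

Along a nonphysical path made of several intermediate states, the function would be applied to each neighboring pair of states and the results summed; the exponential average converges poorly when the two states differ strongly, which is one reason why many small steps are used in practice.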