Software for Drug Discovery and Protein Engineering: A Comparison Between the Alternatives and Recent Advancements in Computational Biology

Adhikary, Tathagata; Basak, Piyali

doi:10.1007/978-3-031-35205-8_9


Abstract

“Omic” technologies (such as genomics, transcriptomics, proteomics, and metabolomics) generate huge databases that demand computational approaches to draw novel conclusions. With the advent of machine learning and artificial intelligence algorithms, the analysis of biological data and protein engineering has taken a step forward. Virtual screening servers and standalone software have established their importance in the initial phase of drug discovery, aiding drug repurposing and high-throughput screening. In addition, interaction networks, often encountered in polypharmacology and network pharmacology, guide researchers in target fishing and in developing drug combinations. Visualization and prediction of molecular structures, including homology modeling and the modeling of antibodies and peptides, are crucial to bioinformaticians and clinical biologists. Biological network analysis, pharmacophore modeling, molecular docking, and dynamics simulation are widely used in computational biology to elucidate the mechanisms underlying biomolecular interactions, thereby revealing the orchestration of biological pathways. Considering the intended purposes, advantages, and limitations of the existing software, this chapter highlights only a fraction of popular platforms and encourages readers to explore other alternatives in various domains of drug discovery and protein engineering.




1 Introduction: The Need for Computational Biology

Discovering novel drug molecules demands huge investments of time, infrastructure, and labor to identify, optimize, and validate the drug-likeness of such molecules through in vitro, in vivo, and preclinical experiments (Lei et al. 2016; Rifaioglu et al. 2019). To ease the process, computational tools are increasingly applied in the early stages of identifying drug-like molecules. Constant advancements in software and algorithms help bridge the “innovation gap” that exists due to higher investments and lower approval rates. The process of drug discovery sequentially includes the identification and validation of disease targets, lead compound identification and optimization, and finally success in clinical trials. Accordingly, establishing a drug can take around 10 to 13 years with huge capital expenditure (Malathi et al. 2018). Challenges arising from the pleiotropic nature of biomolecules and the interaction of chemical compounds with multiple pharmacological targets (often encountered in combinatorial/multitargeted approaches) can be addressed by chemo- and bioinformatics tools that draw on databases of the physicochemical characteristics and therapeutic uses of compounds (Lagunin et al.). The primary healthcare of 80% of the population in developing countries relies on conventional herbal remedies, and sales of plant-based supplements in the United States rose steeply by 380% from 1990 to 2000; both facts encouraged the development of numerous databases on ethnomedicine (Dunkel et al. 2006; Mosihuzzaman and Choudhary 2008). This expands the prospects of utilizing traditional knowledge of medicinal plants in modern-day drug discovery and drug repurposing.

The Protein Data Bank (PDB) was established in 1971 as the first open-access digital repository in the field of biology. It is a collection of 3D structures of biological macromolecules and receptor–ligand complexes, resolved experimentally by X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy (https://www.rcsb.org/) (Burley et al. 2017). ChemCom (Chemical Comparator) is an application based on Java Web Start (JavaWS) technology that includes the UnionBit Tree Algorithm to search and compare large chemical libraries (Saeedipour et al. 2015). The list of such repositories is long (Lagunin et al.). A few open-source databases on medicinal plants, phytochemicals, and other chemical compounds are: Plants For A Future (PFAF), Indian Medicinal Plants Phytochemistry And Therapeutics 2.0 (IMPPAT 2.0), Native American Ethnobotany database, SuperNatural 3.0, The Natural Compound (NC) collection, NCBI PubChem, ChEMBL, Collection of open natural products (COCONUT), Traditional Chinese Medicine Information Database (TCMID), Dr. Duke’s Phytochemical and Ethnobotanical Databases, Aromatic and Medicinal Plants Index (by Purdue University), Agricultural Science and Technology (or AGRIS supported by the Food and Agriculture Organization (FAO) of the United Nations), Compendium of Ayurveda Medicinal Plants of Sri Lanka, Botanical.com, Chinese Herbal Medicine Dictionary (by Complementary and Alternative Healing University), Clinicaltrials.gov database, Medicinal Plant Database (by Botanical Survey of India), EcoPort, TIPdb (a database of indigenous and endemic plant species in Taiwan), Traded Medicinal Plants Database, Herbal Medicines Compendium Medicinal Herbs and Plant Database, Drugs Herbs and Supplements by MedlinePlus, ZINC database, Marowina database medicinal support, Natural Medicines, Herbs at a Glance, Prelude Medicinal Plants Database, Raintree Tropical Plant Database, The World Flora Online, and TRAMIL database (Xie et al.; Duke 2020). Commercial databases with paid access include Chemical Abstracts Service (CAS), HerbalThink-TCM, Dictionary of Natural Products (DNP), and HerbMed.

The increasing data relating to the bioactivities of chemical compounds, the composition of phytoconstituents in an extract, and the target receptors responsible for a specific bioactivity need to be stored and retrieved systematically. “Omic” technologies have led to the development of diverse databases, and interpreting, interconnecting, or mining them is a major challenge to human capabilities. Hence, computational approaches including artificial intelligence and machine learning algorithms (e.g., artificial neural networks (ANN), Naive Bayes, K-Means, support vector machines (SVM), random decision forests, etc.) are widely adopted to answer complex biological questions (Gupta et al. 2021; Muzammil et al. 2023). Continuous development of in silico tools for chemoinformatics and bioinformatics provides insight into the vast multiomics data and offers different perspectives to scientists in the domain of drug discovery. Chemoinformatics particularly aims to model a statistical correlation between the observed bioactivity and structural parameters. These approaches to computer-aided drug design have gained noteworthy momentum in the drug discovery process. Genome-wide functional genetic screening (e.g., using deep learning algorithms) is a cutting-edge technique that has led to the discovery of genotype–phenotype interconnections and established new phenotypes (Zhang et al. 2011). Genomics and proteomics analyses in high-throughput screening have shown promising results in rationalizing the drug discovery process; however, the cost inflation incurred by these technologies has not been matched by the expected growth in the drug approval rate. Freely available software packages frequently employed in machine learning and statistical analysis of data include R, PSPP by the GNU Project, and WEKA, while commercial ones include MATLAB, SAS/STAT, SIMCA, SPSS Statistics by IBM, and TIBCO Data Science/Statistica (Dzemyda et al. 2019).

The first thing to check when selecting software for computer-aided drug design is its vendor and license: whether it is academic, commercial, open-source, or in-house software. Open-source software is popular among academic personnel because, unlike commercial software, it requires no license fee, and its source code is freely available and can be modified by the user. Based on the intended use, license fee, and characteristic features of the software/platforms, attempts have been made to categorize and list the in silico tools employed in the various domains of drug discovery (Singh et al. 2021). Molecular docking, pharmacophore modeling, (Q)SAR methods, molecular dynamics simulation, network pharmacology, and machine learning algorithms accelerate the drug discovery process and complement traditional bioactivity-guided fractionation, high-throughput screening, and systems biology approaches. In this chapter, the tables summarizing the in silico tools cover only a fraction of popular platforms, and readers are encouraged to explore other alternatives in various domains of drug discovery and protein engineering.

2 Visualization of Molecular Structures

Molecular graphics enhances the experience of representing, modeling, and analyzing multifaceted biochemical systems. Besides models of the 3D architecture of molecular structures, 2D illustrations of molecules have gained interest among chemical scientists and biologists in the fields of theoretical chemistry and drug discovery because of their clear representation of structural characteristics and interactions between atoms (Zhou and Shang 2009). Visualizing molecular structures in a virtual reality environment demands rapid, high-quality rendering of geometries to build molecular models with intuitive and informative interactions. Different visualization techniques (such as the space-filling model, the ball-and-stick model, and surface reconstruction of the secondary structures, i.e., alpha helices and beta sheets) are used to represent a molecular model, and some available platforms to design, analyze, and visualize molecular structures are listed in Table 1.

Table 1 Software/servers for designing and visualizing molecular structures


3 Prediction of Pharmacokinetic/Pharmacodynamic Profile

Evaluating the ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties of a molecule is a major step in discovering novel drug compounds. In general, compounds of natural origin tend to have more desirable ADMET properties than synthetic compounds. Early prediction of the ADMET properties of a chemical compound is of utmost importance, since most drug failures occur in the later phases due to undesirable pharmacokinetic and toxicological characteristics. Lipinski’s rule of five is often checked to predict the drug-likeness (in humans) of an orally administered compound (Lipinski 2004; Rego et al. 2022). According to it, a drug molecule should show at most one violation of the following criteria: (a) the molecular weight of the ligand should be at most 500 Daltons, (b) the number of H-bond donors should be at most 5, (c) the number of H-bond acceptors should be at most 10, (d) the octanol–water partition coefficient (miLogP) should be at most 5, and (e) the number of rotatable bonds should be at most 10. Strictly speaking, criterion (e) comes from the later Veber rules rather than from Lipinski’s original four criteria, but it is commonly checked alongside them.
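
The drug-likeness filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular ADMET package: the `Descriptors` container and the function names are hypothetical, the descriptor values are assumed to be computed beforehand by a cheminformatics tool, and the cut-offs are treated as inclusive upper bounds (an assumption).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float       # Daltons
    h_bond_donors: int
    h_bond_acceptors: int
    log_p: float            # octanol-water partition coefficient
    rotatable_bonds: int


def rule_violations(d: Descriptors) -> list[str]:
    """Return the criteria from the text that the compound fails.

    Assumption: every cut-off is an inclusive upper bound.
    """
    checks = {
        "molecular weight <= 500 Da": d.mol_weight <= 500,
        "H-bond donors <= 5": d.h_bond_donors <= 5,
        "H-bond acceptors <= 10": d.h_bond_acceptors <= 10,
        "logP <= 5": d.log_p <= 5,
        "rotatable bonds <= 10": d.rotatable_bonds <= 10,
    }
    return [name for name, passed in checks.items() if not passed]


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept a compound that breaks at most `max_violations` criteria."""
    return len(rule_violations(d)) <= max_violations


# Illustrative descriptor values only; they are not taken from the chapter.
small_molecule = Descriptors(180.2, 1, 4, 1.2, 3)
large_molecule = Descriptors(900.0, 8, 15, 6.5, 14)
print(is_drug_like(small_molecule), rule_violations(large_molecule))
```

Returning the list of failed criteria, rather than only a yes/no answer, makes it easier to see which property of a lead compound would need optimization.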

Most software packages that predict the ADMET properties of compounds (e.g., their affinity toward transporter proteins, blood proteins, and drug-metabolizing cytochrome P450 isoforms) use the structural/physicochemical characteristics of the compounds to develop (Q)SAR models. Derek Nexus (Lhasa Ltd.), TOPKAT (Accelrys), OSIRIS Property Explorer, MCASE (Multicase), and PASS can be used to predict various toxicities and report the teratogenic, mutagenic, cardiotoxic, hepatotoxic, carcinogenic, and renal-toxic nature of compounds (Kar et al. 2018). The online server of GUSAR (www.way2drug.com) predicts the LD₅₀ values of query compounds in rodents for four different routes of administration. Some other software/web servers for studying ADMET properties are listed in Table 2.

Table 2 In silico tools used in the prediction of ADMET properties of small molecules


4 Prediction of Structures Including Homology Modeling

Molecular modeling in structure-based drug design requires 3D structures of the receptor and ligand molecules (experimentally determined by X-ray crystallography or NMR spectroscopy). Where experimental data are unavailable, existing data and sequences can be used to predict the structures by homology-based modeling, sometimes referred to as comparative modeling of proteins. The amino acid sequence of a protein (acquired from NCBI or UniProt) is used to generate the structure using computational tools. Evolutionarily related proteins share sequence similarity, and homologous proteins exhibit similarity in their structures (substitution matrices such as BLOSUM60 describe such homology). The 3D structure of a protein is found to be evolutionarily more conserved than its sequence alone (Kaczanowski and Zielenkiewicz 2010). Homology modeling starts with recognizing a template that shows sequence similarity (the search is accomplished with BLAST (Basic Local Alignment Search Tool), PSI-BLAST (Position-Specific Iterated BLAST), or fold recognition methods), followed by alignment with the known, experimentally resolved structures in the database. Templates with less than 30% similarity are generally not preferred in homology modeling. BLAST compares a query sequence with the existing database and identifies the most suitable sequences with significant similarity, i.e., it identifies the homologous sequences. Alignments with an expectation value (E-value) closer to zero indicate higher similarity. A higher E-value makes aligning the two sequences difficult; in this scenario, including sequences from other homologous proteins can help (Pearson 2013; Alves et al. 2023). Multiple sequence alignment programs, e.g., CLUSTALW, can align sequences by introducing insertions and deletions. Alignment correction, if not done properly, will generate defective structures. 
Some of the methods used to build models are spatial restraint, rigid-body assembly, segment matching, and artificial evolution. Modeling tools such as MODELLER can be used to build the backbone from the aligned sequences, and the accuracy of such methods is benchmarked in the community-wide CASP (Critical Assessment of protein Structure Prediction) experiments. Most often, aligning the model sequence with the template sequence creates gaps that can be resolved by considering conformational changes and insertions/deletions/substitutions of amino acid residues. Thus, refining the model includes loop modeling and side-chain modeling following the principles of molecular dynamics, Monte Carlo, and genetic algorithms. After modeling, structures are energetically minimized by employing force fields (for instance OPLS, AMBER, MM3, and CHARMM22 force fields) (Lewis-Atwell et al.). Loop modeling can be knowledge-based or energy-based. Knowledge-based loop modeling, sometimes referred to as template-based or homology-based, searches existing databases to identify known loop conformations that match the input sequence and the geometric descriptors of the anchoring points (Karami et al.). It does not require complex simulations or high computational power; however, it relies on appropriate loop conformations being present in the existing repositories of protein structures to cover the entire conformational space. Energy-based loop modeling corresponds to non-template-based or de novo methods that use an energy function and minimize it by Monte Carlo methods or molecular dynamics to optimize the loop conformation. Proteins that share structural similarity also exhibit similarity in the torsion angle about the Cα–Cβ bond (the χ1 angle) when side-chain conformations are considered. Conserved residues can be copied in their entirety from the template to the model, which yields more accurate results than methods that copy only the backbone and predict the side chains. 
Modeling of side chains includes knowledge-based methods that extract a library of rotamers from known crystallographic structures and substitute the side chains onto the backbone structure. After side-chain modeling, the model is assessed using root mean square deviation (RMSD) values. The errors found in the final model depend on the extent of similarity between the template and the target. If the similarity is >90%, the crystallographic structure is predicted fairly well, whereas for a similarity <90%, the RMSD errors become significant. Errors can be estimated by using a force field to calculate the energy of the model and by checking whether the bond lengths and angles fall within the normal range (Dolan et al. 2012; Wink et al. 2019; Lima et al. 2022). However, this method does not evaluate the folding of the model; misfolding in proteins is assessed with 3D distribution functions. Model validation is necessary to establish the prediction accuracy.
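
The RMSD used above to compare a model with a reference structure can be computed as follows. This is a minimal sketch that assumes the two structures are already superimposed and that their atoms are listed in the same order; it does not perform the optimal superposition that dedicated tools carry out first, and the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two lists of (x, y, z) tuples.

    Assumptions: both lists hold the same atoms in the same order and the
    structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Two atoms, each displaced by exactly 1 Angstrom along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, reference))  # 1.0
```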

The stereochemical aspects of the protein can be explored with WHATCHECK, WHAT IF, VADAR, and PROCHECK. The Ramachandran plot, a two-dimensional plot of the backbone torsion angles φ (phi) and ψ (psi) of the amino acid residues in a protein, is used to analyze the stereochemical and geometrical quality of the structure and to check for residues lying in the sterically unfavored regions of the plot. A higher proportion of residues in the favored region indicates the structural feasibility of the model (Agnihotry et al. 2022). Popular tools for homology-based modeling are MODELLER, Swiss-PdbViewer, SWISS-MODEL, and COMPOSER (Malathi et al. 2018). MODELLER is also used for sequence searching and for comparing and clustering protein structures or sequences. In brief, homology modeling involves template identification, sequence alignment, structural modification, energy minimization, and model validation to predict the 3D structure.
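
The template identification step can be illustrated with a simple screen on a pairwise alignment. The sketch below uses percent identity as a simple stand-in for the similarity measure mentioned earlier in this section, with the 30% cut-off as the default threshold. The choice of denominator (aligned, non-gap positions) is an assumption, since several conventions exist, and the function names are hypothetical.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap.

    Both arguments are rows of a pairwise alignment and must be equally long.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (q, t)
        for q, t in zip(aligned_query, aligned_template)
        if q != gap and t != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for q, t in pairs if q == t)
    return 100.0 * matches / len(pairs)


def is_usable_template(aligned_query, aligned_template, threshold=30.0):
    """Keep templates at or above the threshold (30% as quoted in the text)."""
    return percent_identity(aligned_query, aligned_template) >= threshold


# Toy alignment: 5 gap-free columns, 4 of them identical.
print(percent_identity("ACDE-G", "ACDFHG"))  # 80.0
```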

5 Interaction Networks

In 2007, Hopkins introduced the concept of network pharmacology, which applies network analysis algorithms (to the existing knowledge of biological networks consisting of structural/physicochemical properties of proteins/ligands, the interaction of a protein/gene with another protein/gene/ligand, and signaling and metabolic pathways) to predict the therapeutic action of small molecules, elucidate their mechanism of action, and understand drug–disease relationships at the system level (Csermely et al. 2013). Visualization of biological networks (such as pie-nodes and edge-pie matrix visualization) and network comparison (by network alignment and by computing pairwise similarity between selected networks) are essential for network analysis, for identifying key components/nodes/interactions in a biological system of interest, and for highlighting the union/intersection/complement regions in a set of biological networks. Networks can highlight the interacting elements within a complex biochemical system, thus aiding the visualization and exploration of big data. However, the large size and high complexity of biological networks produce so-called “hairballs” in the networks. Hence, there is a need for an efficient and interactive graphical user interface for network comparison and visualization (Pirch et al. 2021; Almeida et al. 2022). One needs to consider several types of relationships (namely “target–effect,” “target–pathway,” “pathway–effect,” and “target–pathway–effect” relationships) to investigate the pleiotropic and synergistic effects of a drug compound or a combination of drug compounds. The benefit of conceptualizing such “cause–effect” relationships unfolds gradually: if the bioactivity of a drug is linked to certain molecular targets and their corresponding pathways are established, then other modes of action that influence those pathways can yield similar effects.
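
The union/intersection/complement comparison and the identification of key nodes mentioned above can be sketched with plain Python sets. This is only an illustration under simple assumptions: edges are undirected and unweighted, a "key node" is taken to be the node with the highest degree, and all node names are placeholders. Dedicated platforms offer far richer analyses.

```python
def as_edge_set(edges):
    """Store undirected edges as frozensets so that (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges}


def node_degrees(edge_set):
    """Count how many edges touch each node."""
    degrees = {}
    for edge in edge_set:
        for node in edge:
            degrees[node] = degrees.get(node, 0) + 1
    return degrees


def hubs(edge_set):
    """Return the node(s) with the highest degree, sorted by name."""
    degrees = node_degrees(edge_set)
    if not degrees:
        return []
    top = max(degrees.values())
    return sorted(node for node, deg in degrees.items() if deg == top)


# Hypothetical toy networks; node names are placeholders.
network_a = as_edge_set([("T1", "T2"), ("T1", "T3"), ("T1", "DrugX")])
network_b = as_edge_set([("T2", "T1"), ("T3", "DrugX")])

shared = network_a & network_b      # intersection of the two networks
combined = network_a | network_b    # union
only_in_a = network_a - network_b   # complement of B within A

print(len(shared), len(combined), len(only_in_a), hubs(combined))
```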

Analysis of biological pathways (such as signaling pathways, regulatory pathways, metabolic pathways, signal transduction pathways, etc.) makes use of various pathway databases (Lagunin et al.), including WikiPathways (https://www.wikipathways.org), HumanCyc (https://humancyc.org/), NetPath (http://www.netpath.org/), Reactome (https://reactome.org/), KEGG (https://www.genome.jp/kegg/pathway.html), SignaLink (http://signalink.org/), and the Small Molecule Pathway Database (https://www.smpdb.ca/). QIAGEN Ingenuity Pathway Analysis (IPA) is an online platform used to analyze, integrate, model, and interpret data from “omic” technologies, including RNAseq experiments and Single-Nucleotide Polymorphism (SNP) microarrays. It aids in identifying genes and pathways that functionally interact with drug molecules and compares the gene regulatory circuits involved in phenotypic responses. Connectivity Map (CMap) links the genes underlying various diseases with drugs currently in use and enables data-driven analysis for repurposing/reprofiling/repositioning of drugs (it does so by analyzing disease-specific and drug-specific gene signatures). A user submits a “gene hit list” (also known as a “signature”) to CMap, which compares it with a database of differential gene expression (DE) profiles (obtained by perturbing cell lines with numerous drug-like molecules) and outputs a ranked list of compounds whose expression patterns resemble the query hit list. The CMap resource hosts over 1.5 million gene expression profiles from around 5000 chemical compounds and 3000 genetic reagents tested in various cell lines (Lim and Pavlidis 2021). The similarity in gene expression profiles based on drug–drug, drug–disease, and disease–disease relationships is used to create disease–drug networks for studying the potential side effects, targets, and pathways associated with a drug compound. 
Aside from CMap, the Gene Expression Omnibus (GEO) and the Comparative Toxicogenomics Database (CTD) can be used to create such disease-specific gene expression signatures. DIGEP-Pred is a free web-based platform that considers the structural characteristics of compounds to predict drug-induced variations in gene expression profiles (Lagunin et al. 2013). Natural Product-based Drug Combination and Its Disease-specific Molecular Regulation (NPCDR) is an interactive database that shares knowledge on drug combinations (of natural products) with clinical or experimental validation. It also provides information on disease-specific molecular recognition and pathways and allows integration of available databases, easing research on network pharmacology and medicinal chemistry (Sun et al. 2022).
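
The idea of ranking compounds against a query gene signature can be illustrated with a deliberately simplified score. The sketch below is not the scoring method used by CMap itself; it assumes each compound profile is a dictionary mapping gene names to a differential expression value (for example a log fold change) and scores a compound by the mean expression of the query's up-regulated genes minus that of its down-regulated genes. All gene and compound names are hypothetical.

```python
def signature_score(profile, up_genes, down_genes):
    """Simplified similarity between a compound profile and a query signature.

    Positive scores mean the compound mimics the signature; negative scores
    mean it tends to reverse it. Genes missing from the profile are skipped.
    """
    def mean_expression(genes):
        values = [profile[g] for g in genes if g in profile]
        return sum(values) / len(values) if values else 0.0

    return mean_expression(up_genes) - mean_expression(down_genes)


def rank_compounds(profiles, up_genes, down_genes):
    """Return (compound, score) pairs sorted from most to least similar."""
    scores = {
        name: signature_score(profile, up_genes, down_genes)
        for name, profile in profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: two compounds, two genes.
profiles = {
    "cmpd_1": {"G1": 2.0, "G2": -1.0},
    "cmpd_2": {"G1": -1.5, "G2": 1.0},
}
ranking = rank_compounds(profiles, up_genes=["G1"], down_genes=["G2"])
print(ranking)  # [('cmpd_1', 3.0), ('cmpd_2', -2.5)]
```

For drug repurposing, the compounds at the bottom of such a ranking (those that reverse a disease signature) are often the more interesting candidates.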

Platforms that are free for academic use in bioinformatics and systems biology research for analyzing complex data from “omic” technologies include OmicsNet (https://www.omicsnet.ca/) (Zhou and Xia 2018), Cell Illustrator (http://www.cellillustrator.com/home), Cytoscape (https://cytoscape.org/), ConsensusPathDB (http://cpdb.molgen.mpg.de/), Gene Set Enrichment Analysis or GSEA (https://www.gsea-msigdb.org/gsea/index.jsp), The Database for Annotation, Visualization and Integrated Discovery or DAVID (https://david.ncifcrf.gov/), and VANESA (https://cbrinkrolf.github.io/VANESA/). Software with paid licenses includes the geneXplain platform (https://genexplain.com/), QIAGEN Ingenuity Pathway Analysis (https://digitalinsights.qiagen.com/products-overview/discovery-insights-portfolio/analysis-and-visualization/qiagen-ipa/), and Elsevier’s Pathway Studio (https://www.elsevier.com/en-in/solutions/pathway-studio-biological-research). Other alternatives that can be explored in this domain of research are presented in Table 3.

Table 3 Some software/servers to generate, visualize, and analyze biological networks


6 Pharmacophore Modeling and Molecular Docking

Virtual screening identifies, from a large library of chemical compounds, the lead compounds that have a specific bioactivity. It can be structure-based or ligand-based. The former approach utilizes the 3D structure of the target protein and performs molecular docking to report the potential active compounds that exhibit a good binding affinity/score with the target receptor structure. Molecular docking is a structure-based approach used to predict the 3D orientation of the ligand molecule with respect to a particular conformation of the receptor molecule when both interact and form a stable complex (Sahoo et al.). It is one of the first-line tools used in discovering/designing novel drug molecules: it predicts the binding affinity of a chemical compound for the target receptor and ranks the ligands by their respective docking scores. Molecular docking predicts an atomistic model of the receptor–ligand interactions and their binding orientations. In site-specific or targeted docking, the active sites of the target protein are reviewed or predicted using programs such as CASTp, Q-SiteFinder, LigASite, and MetaPocket, while blind docking considers the entire protein structure as the probable region of ligand interaction (Wong and Kwan 2015). Search algorithms that fish out favorable conformations from a practically unlimited number of possibilities include matching algorithms, incremental construction methods, multiple copy simultaneous searching, Monte Carlo, and genetic algorithms. The scoring functions of docking software (empirical, force-field-based, or knowledge-based) estimate the binding affinity of the ligand for the target receptor and rank the ligands by docking score.

Qualitative structure–activity relationships (SAR) and quantitative structure–activity relationships (QSAR) are used in virtual screening (and target fishing) if the structures of the chemical compounds are available or have been predicted/designed. These approaches treat the bioactivity of a ligand as a function of its structural or physicochemical characteristics. Analysis and comparison of the structures are achieved with the help of descriptors (such as structural fragments, fingerprints, and constitutional, topological, electro-topological, quantum-chemical, and physicochemical descriptors) (Lagunin et al.). Pharmacophore modeling considers a group of atoms in the structure whose presence directs the pharmacological effect of the ligand. Ligand-based virtual screening employs QSAR approaches that aim to develop mathematical models of the correlation between the observed bioactivities and structural/physicochemical characteristics. Software such as Sybyl-X 2.0 and E-Dragon performs QSAR studies (Browne et al.; Fedyushkina et al. 1990).

Two techniques, namely comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA), are encountered in 3D QSAR for ligand-based drug design (Chavda and Bhatt 2019). In CoMFA, a library of ligands comprising their physicochemical characteristics and biological activities is created. These bioactive compounds differ from one another by some substitutions. Seventy percent of the data is used as the training set (regression models are generated from it by Partial Least Squares (PLS) regression against the pIC50 values), whereas the remaining 30% is kept as the test set (used to establish the prediction accuracy of the QSAR regression models). Finally, the models undergo leave-one-out (LOO) cross-validation. The descriptors in CoMFA are determined with an sp³ carbon probe. The Coulombic potential energy gives the electrostatic field, and the Lennard-Jones potential energy describes the van der Waals (steric) interactions. The 3D steric and electrostatic contour plots depict the variation in bioactivity with the alteration of molecular fields. The SEAL similarity method in CoMSIA takes into account electrostatic, steric, hydrogen-bonding, and hydrophobic descriptors to predict the similarity between molecules using Gaussian functions. The contour plots produced by CoMSIA portray the favorable and unfavorable regions for the interaction of ligands (Bordás et al. 2003).
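
The two field terms named above can be evaluated at a single grid point as in the sketch below. It uses the standard 12-6 Lennard-Jones form and Coulomb's law with the conversion factor 332.06 kcal·Å/(mol·e²). The probe charge of +1, the energy cap of 30 kcal/mol, the per-atom parameter values, and the dictionary layout are assumptions made for illustration; this is not the implementation of any CoMFA package.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2), standard conversion factor


def coulomb_energy(q1, q2, r):
    """Electrostatic energy (kcal/mol) of two point charges r Angstrom apart."""
    return COULOMB_CONSTANT * q1 * q2 / r


def lennard_jones_energy(epsilon, sigma, r):
    """12-6 Lennard-Jones energy: 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def probe_fields(grid_point, atoms, probe_charge=1.0, cap=30.0):
    """Sum steric and electrostatic energies felt by a probe at one grid point.

    `atoms` is a list of dicts with keys "xyz", "charge", "epsilon", "sigma"
    (an assumed layout). Energies are capped so that grid points very close
    to an atom do not dominate the descriptor table.
    """
    steric = 0.0
    electrostatic = 0.0
    for atom in atoms:
        r = max(math.dist(grid_point, atom["xyz"]), 1e-6)  # avoid division by zero
        steric += lennard_jones_energy(atom["epsilon"], atom["sigma"], r)
        electrostatic += coulomb_energy(probe_charge, atom["charge"], r)
    steric = min(steric, cap)
    electrostatic = max(-cap, min(electrostatic, cap))
    return steric, electrostatic


# One hypothetical atom at the origin, probe placed 4 Angstrom away.
atoms = [{"xyz": (0.0, 0.0, 0.0), "charge": -0.2, "epsilon": 0.1, "sigma": 3.4}]
print(probe_fields((4.0, 0.0, 0.0), atoms))
```

Repeating this calculation over a regular grid around each aligned ligand gives the table of field values on which the PLS regression is then run.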

Approaches to build pharmacophore-based models identify the molecular characteristics that direct the macromolecular recognition of ligands, thus triggering the biological response. The aromaticity, hydrophobicity, presence of hydrogen bond acceptors/donors and anion/cation residues are considered to model pharmacophores that act as a query to search the potential bioactives from a database of compounds in virtual screening. Developing pharmacophore models can follow either structure-based or ligand-based approaches. The former approach relies on the availability of X-ray crystallographic or NMR spectroscopic 3D structure of the receptor molecule/target protein. The active sites and the spatial interactions are described by certain physicochemical properties that complement the interacting ligands and selectively identify the compounds with high binding affinity. A good model must incorporate protein flexibility to consider the structural changes that occur during the formation of the receptor–ligand complex. Ligand-based modeling is useful in cases where the 3D molecular structure of the receptor molecule is not available and the pharmacophores are generated by studying the common features (e.g., hydrophobic and electrostatic interaction, hydrogen bonding, etc.) that exist at the same position in the ligand structures. In ligand-based pharmacophore modeling, chemical compounds in the training set create a conformational space that takes care of ligand flexibility (Braga et al.). HipHop, DISCO, HypoGen, and PHASE are some software for generating pharmacophore models.

The structural data generated by NMR, X-ray crystallography, and homology modeling are static in nature and fail to describe the dynamic nature of the biorecognition process during receptor–ligand binding. These data highlight the binding sites for some endogenous agonists; however, other active sites (including allosteric and cryptic binding sites) are often not identified. Neither the receptor nor the ligand is a frozen/rigid entity; instead, the structures interact while in constant motion in solution (any biological fluid). Moreover, an approaching ligand can cause a series of conformational changes in the receptor structure that improve its binding affinity. To account for the flexibility of the macromolecular structures, the relaxed complex scheme (RCS) has been developed, which extracts several conformations of the receptor sampled by simulation and then docks the ligand to each of these conformations. Scoring functions often treat conformational entropy and solvation energy as negligible when calculating the binding affinity, which makes the process computationally less expensive at the cost of the model’s accuracy.

Researchers often employ both QSAR modeling and molecular docking to predict the bioactivity and investigate the mechanism of action of compounds. In one study, the immunomodulatory effect of the ligands was evaluated by employing forward stepwise multiple linear regression to develop a QSAR model with 52 physicochemical descriptors (the important ones being dipole moment, steric energy, amide group count, λmax (UV-visible), and molar refractivity) using the SCIGRESS platform. Molecular docking was then performed to predict their binding affinity with the immunomodulatory targets TLR-4, iNOS, COX-2, CD14, IKK-β, CD86, and COX-1 (Yadav et al. 2010). A similar QSAR model with 50 descriptors from SYBYL-X 1.3 was used to study the cytotoxicity of ursolic acid analogs against human glioblastoma and lung cancer cell lines. The model exhibited a good regression coefficient (r²) and cross-validation regression coefficient (r_cv²), with values ranging from 0.8 to 0.96. The relevant parameters for cytotoxicity were found to be LUMO energy, ring count, dipole vector, and solvent-accessible surface area (Kalani et al. 2012).

Some freely available software and webservers to generate descriptors (including arithmetical, topological, constitutional, geometrical, electrostatic, thermodynamic, and quantum-chemical descriptors and other molecular fingerprints) are AFGen (http://glaros.dtc.umn.edu/gkhome/afgen/overview), ISIDA-fragmentor (https://complex-matter.unistra.fr/equipes-de-recherche/laboratoire-de-chemoinformatique/software-development/#c89382), E-DRAGON (http://www.vcclab.org/lab/edragon/), Open3DQSAR (https://open3dqsar.sourceforge.net/), ToMoCoMD-CARDD (http://tomocomd.com/), MOLGEN (http://molgen.de/?src=documents/molgenqspr.html), Mold2 (https://www.fda.gov/science-research/bioinformatics-tools/mold2), the Toxicity Estimation Software Tool (TEST) by the United States Environmental Protection Agency (https://www.epa.gov/chemical-research/toxicity-estimation-software-tool-test), and Open Babel (http://openbabel.org/wiki/Main_Page), while CODESSA PRO (http://www.codessa-pro.com/) is a commercial alternative. Along with the model’s high internal accuracy (i.e., R² > 0.9 and Rcv² > 0.8, calculated using the training set only), external validation of the (Q)SAR model with experimental data is desirable as per the OECD guidelines (www.oecd.org/env/ehs/risk-assessment/37849783.pdf). To better correlate structural characteristics with bioactivities, one must use molar units (such as mol/kg or mmol/kg) instead of mass units (e.g., mg/kg) in the models (Dearden et al. 2009). In inverse docking or target fishing, the possible targets/receptors for a query ligand are identified using software such as GOLD, FlexX, TarFisDock, TarSearch-X, and TarSearch-M.
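The recommendation to model molar rather than mass units can be made concrete with a short Python sketch. The function names and the two compounds below are hypothetical and the doses are invented for illustration; the conversion itself is plain arithmetic (mg divided by g/mol gives mmol).

```python
import math


def mg_per_kg_to_mmol_per_kg(dose_mg_per_kg, molar_mass_g_per_mol):
    """Convert a mass-based dose to a molar dose: mg / (g/mol) = mmol."""
    return dose_mg_per_kg / molar_mass_g_per_mol


def negative_log_molar_dose(dose_mg_per_kg, molar_mass_g_per_mol):
    """Return -log10 of the dose in mol/kg, a common QSAR response variable."""
    mol_per_kg = dose_mg_per_kg / 1000.0 / molar_mass_g_per_mol
    return -math.log10(mol_per_kg)


# Two made-up compounds with the same mass-based dose (200 mg/kg)
# but different molar masses (100 and 400 g/mol).
light = mg_per_kg_to_mmol_per_kg(200.0, 100.0)
heavy = mg_per_kg_to_mmol_per_kg(200.0, 400.0)
print(light, heavy)
print(negative_log_molar_dose(200.0, 100.0), negative_log_molar_dose(200.0, 400.0))
```

Although both compounds look equally potent in mg/kg, the heavier one acts at a quarter of the number of molecules, so only the molar scale reflects potency per molecule.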

The bioactivities of a novel compound (i.e., its potential drug targets) can be evaluated using pairwise similarity with known compounds (e.g., the ChEMBL database calculates the Tanimoto coefficient based on fingerprints), molecular docking, pharmacophore modeling, Bayesian statistics, and substructural descriptors or fingerprints. However, one must take care to avoid the “activity-cliff” problem, which arises when compounds share analogous structural characteristics but exhibit dissimilar bioactivity spectra. Despite being a rapid and efficient virtual screening technique, pharmacophore modeling essentially relies on knowledge of reported active ligands, necessitates sampling conformers using a search algorithm, and is based on a rigid framework for searching hit compounds from the database (Horvath 2010; Kaushik et al. 2018; Lans et al. 2020).
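The Tanimoto coefficient and the activity-cliff problem mentioned above can be illustrated with a small Python sketch. Fingerprints are represented here as sets of "on" bit indices; the compounds, bit patterns, activity values, and thresholds are all invented for the example, and `activity_cliffs` is a hypothetical helper rather than a function of any particular toolkit.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits divided by all on-bits."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def activity_cliffs(compounds, min_similarity=0.75, min_gap=2.0):
    """Flag pairs that look alike (high Tanimoto) but differ in activity.

    `compounds` maps a name to (fingerprint, activity on a log scale).
    Both thresholds are arbitrary choices for this sketch.
    """
    cliffs = []
    for (name_a, (fp_a, act_a)), (name_b, (fp_b, act_b)) in combinations(
        compounds.items(), 2
    ):
        if tanimoto(fp_a, fp_b) >= min_similarity and abs(act_a - act_b) >= min_gap:
            cliffs.append((name_a, name_b))
    return cliffs


# Made-up fingerprints and pIC50-like values.
compounds = {
    "A": ({1, 2, 3, 4, 5, 6, 7, 8, 9}, 8.0),
    "B": ({1, 2, 3, 4, 5, 6, 7, 8, 10}, 5.5),
    "C": ({20, 21, 22}, 5.0),
}
print(tanimoto(compounds["A"][0], compounds["B"][0]))
print(activity_cliffs(compounds))
```

A similarity-based prediction would assign A and B nearly the same activity, which is exactly the failure mode that an activity cliff produces.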

Some platforms to perform protein-protein or protein-DNA docking include SPServer (http://aleph.upf.edu/spserver/), pyDockDNA (https://model3dbio.csic.es/pydockdna), CoDockPP (http://codockpp.schanglab.org.cn/), DOCKSCORE (http://caps.ncbs.res.in/dockscore/), PIIMS Server (http://chemyang.ccnu.edu.cn/ccb/server/PIIMS/index.php), GalaxyDomDock (https://galaxy.seoklab.org/cgi-bin/submit.cgi?type=DOMDOCK_INTRO), P3DOCK server, HDOCK server, and GRAMM (Global RAnge Molecular Matching) (https://gramm.compbio.ku.edu/). ezCADD is a fast 2D/3D molecular visualization software that allows small-molecule docking, protein-protein docking, prediction of binding sites, identification of drug targets, homology modeling and structure quality assessment (Tao et al. 2019). FragVLib is an open-source software (distributed under the GNU General Public License) that generates a virtual library of ligand fragments (used for structure-based drug designing) by searching the binding pocket similarity considering a database of ligand-receptor complexes (Khashan 2012). eMolFrag is used for the virtual fragmentation of molecules and extracts the molecular fragments to build a library for virtual screening (Liu et al. 2017). 
Other software/servers used in virtual screening (structure-based and/or ligand-based) can be listed as follows, although other popular platforms do exist: DENVIS (https://github.com/deeplab-ai/denvis), ReMODE (Receptor-based MOlecular Design for de novo drug designing, available at http://cadd.zju.edu.cn/relation/remode/), Pocket2Drug (https://github.com/shiwentao00/Pocket2Drug), DrugRep (http://cao.labshare.cn/drugrep/), DockingPie (a docking plugin for PyMOL), CB-Dock2 (https://cadd.labshare.cn/cb-dock2/php/index.php), PharmRF (https://github.com/Prasanth-Kumar87/PharmRF), DeepDock (https://github.com/OptiMaL-PSE-Lab/DeepDock), Knime workflow (https://hub.knime.com/), RNALigands (http://rnaligands.ccbr.utoronto.ca/php/downloads.php), AutoDock Vina (https://vina.scripps.edu/), eSPC (https://spc.embl-hamburg.de/), RASPDplus (https://github.com/HITS-MCM/RASPDplus), LigRMSD (https://ligrmsd.appsbio.utalca.cl/), LeDock (http://www.lephar.com/index.htm), VSpipe (https://github.com/sabifo4/VSpipe), PyRx (https://pyrx.sourceforge.io/), LiSiCA (Ligand Similarity using Clique Algorithm, available at http://insilab.org/lisica/), AILDE (http://chemyang.ccnu.edu.cn/ccb/server/AILDE/), Open3DALIGN (https://open3dalign.sourceforge.net/), PrepFlow (https://ifm.chimie.unistra.fr/prepflow), QSAR-Co-X (https://github.com/ncordeirfcup/QSAR-Co-X), PyRMD (https://github.com/cosconatilab/PyRMD), SwissSimilarity (http://www.swisssimilarity.ch/), PharmMapper (http://lilab-ecust.cn/pharmmapper/), and ZINCPharmer (http://zincpharmer.csb.pitt.edu/).

7 Molecular Dynamics Simulation

The deterministic approach of the classical (Newtonian) model of motion in the macroscopic world contrasts with the probability functions of the quantum-mechanical model that describe motion in the microscopic world. This is because electron clouds (which interact during bonding) exhibit wave-particle duality rather than simple mechanical bonding. Simulating systems of proteins and other receptor molecules interacting with ligands at the atomistic level has become important to the drug discovery process. Breakthroughs in hardware-based computational power and the development of new algorithms ease the calculation of the molecular forces that exist in the system. Such simulations overcome the limitations of the conventional “lock and key” model of receptor–ligand interaction (where the receptor is held rigid and only the conformations of the ligand are sampled, restricting atomistic motion to keep the model simple). They account for the dynamic nature of proteins, which sample numerous conformational states that are selectively stabilized when an agonist or antagonist binds. Any simulation starts with modeling the receptor–ligand system (using data obtained from NMR, crystallography, or homology modeling); subsequently, the forces experienced by each atom in the system are estimated, and the atomic positions are updated following Newton’s laws of motion. These forces result from bonded interactions (chemical bonds and atomic angles, modeled as virtual springs, and dihedral angles, modeled by a sinusoidal function that approximates the difference in potential energy between eclipsed and staggered conformations) and nonbonded interactions (van der Waals interactions, modeled by the Lennard-Jones 6–12 potential, and charged/electrostatic interactions, modeled by Coulomb’s law).
The parameters used in these functions specify the stiffness and equilibrium lengths of the springs, the equilibrium atomic angles (and dihedral angles), the partial atomic charges (responsible for electrostatic interactions), and the van der Waals atomic radii. This parameterization forms the basis of a “force field” that describes the molecular dynamics under the influence of the various atomic forces. Finally, the simulation time is advanced (by 1–2 femtoseconds, i.e., 1–2 × 10⁻¹⁵ s), and the process is iterated (on the order of 10⁶ times) (Durrant and McCammon 2011). Different force fields exist depending on how they are parameterized, although they mostly generate similar outputs. The AMBER, CHARMM, and GROMOS force fields are generally encountered in simulation modeling. Molecular dynamics simulation demands a huge number of calculations; hence, computer clusters or supercomputers with numerous processors need to operate in parallel. Message Passing Interface (MPI) compatible simulation software such as NAMD, CHARMM, and AMBER helps connect multiple processors so that they can be used simultaneously to execute a complex assignment. Such simulations can estimate the values of NMR-related parameters (e.g., spin relaxation), thus allowing comparison between the theoretical prediction and the experimental value.
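The energy terms of a force field and the repeated advance of the simulation time can be sketched in plain Python. The functional forms below (harmonic spring, cosine dihedral, Lennard-Jones 6–12, Coulomb) are the textbook ones; the parameter values are arbitrary, the units are reduced rather than physical, and the velocity Verlet integrator is one common choice assumed for this example, not necessarily the scheme of any particular package.

```python
import math


def harmonic_bond_energy(r, k, r0):
    """Virtual spring: 0.5 * k * (r - r0)^2."""
    return 0.5 * k * (r - r0) ** 2


def dihedral_energy(phi, v_n, n, gamma):
    """Sinusoidal torsion term: 0.5 * V_n * (1 + cos(n * phi - gamma))."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))


def lennard_jones_energy(r, epsilon, sigma):
    """Lennard-Jones 6-12 potential for van der Waals interactions."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb_energy(r, q_i, q_j, k_e=1.0):
    """Coulomb's law between two partial charges (k_e = 1 in reduced units)."""
    return k_e * q_i * q_j / r


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps time steps of size dt."""
    f = force(x)
    trajectory = [x]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # full-step position update
        f = force(x)                      # forces at the new position
        v = v_half + 0.5 * dt * f / mass  # complete the velocity update
        trajectory.append(x)
    return trajectory, v


# One stretched bond in one dimension, reduced units throughout.
k, r0, mass, dt = 1.0, 1.0, 1.0, 0.01


def bond_force(r):
    # Force is the negative derivative of the harmonic bond energy.
    return -k * (r - r0)


trajectory, v_final = velocity_verlet(1.2, 0.0, bond_force, mass, dt, 1000)
e_start = harmonic_bond_energy(1.2, k, r0)
e_end = harmonic_bond_energy(trajectory[-1], k, r0) + 0.5 * mass * v_final ** 2
print(e_start, e_end)
```

With a threefold cosine term (`n = 3`, `gamma = 0`), `dihedral_energy` is largest at the eclipsed angle 0 and vanishes at the staggered angle π/3, which is the energy difference the sinusoidal function is meant to capture. The near-constant total energy over 1000 steps is a quick sanity check on the chosen time step.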

Molecular dynamics simulations follow Newton’s laws of motion and output trajectories for evaluating the stability of the target protein or its docked complexes. To perform a simulation, the protein topology is generated by applying a force field such as AMBER or GROMOS (using GROMACS or the LEaP program), while the PRODRG server can be used to obtain the ligand topology (Strasser and Wittmann 2013). The structures are placed inside a cubic box and solvated using the flexible simple point-charge (SPC) water model. Following system neutralization, the steepest descent algorithm minimizes the energy of the system. At a chosen temperature (say 300 K), position-restrained simulations are performed for a certain period under constant volume and temperature (NVT) and constant pressure and temperature (NPT) conditions. The LINear Constraint Solver (LINCS) algorithm is frequently used for molecular simulations with bond constraints (Hess et al. 1997). The Particle Mesh Ewald algorithm estimates the long-range electrostatic energy (Madelung energy) of the complex/crystal. After the simulation, time-dependent plots are generated with the XMGrace tool, and parameters such as the root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), and intermolecular hydrogen bonds are analyzed to assess the stability of the protein-ligand complex (Van Der Spoel et al. 2005). The advantages of molecular dynamics simulation come at a cost: the process is computationally expensive. A short simulation time leads to inadequate conformational sampling.
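The stability measures named above (RMSD, RMSF, and Rg) reduce to a few array operations. The following NumPy sketch uses a made-up two-frame trajectory and hypothetical function names; it assumes the frames have already been superimposed on the reference structure, a fitting step that analysis tools normally perform first and that is omitted here.

```python
import numpy as np


def rmsd(coords, reference):
    """Root mean square deviation between two (n_atoms, 3) coordinate sets."""
    diff = coords - reference
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def rmsf(trajectory):
    """Per-atom fluctuation around the mean structure.

    `trajectory` has shape (n_frames, n_atoms, 3).
    """
    mean_structure = trajectory.mean(axis=0)
    sq_dev = ((trajectory - mean_structure) ** 2).sum(axis=2)
    return np.sqrt(sq_dev.mean(axis=0))


def radius_of_gyration(coords, masses):
    """Mass-weighted root mean square distance from the center of mass."""
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = ((coords - center) ** 2).sum(axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


# Toy system: three atoms, two frames; only atom 0 moves along x.
frame_a = np.array([[-1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
frame_b = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
trajectory = np.stack([frame_a, frame_b])

per_frame_rmsd = [rmsd(frame, frame_a) for frame in trajectory]
per_atom_rmsf = rmsf(trajectory)
rg = radius_of_gyration(np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]), [1.0, 1.0])
print(per_frame_rmsd, per_atom_rmsf, rg)
```

Plotting the per-frame RMSD against time and the RMSF against atom or residue index gives the kind of curves usually inspected for a protein-ligand complex.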
Force fields approximate the quantum-mechanical model of motion at the atomistic level; hence, molecular dynamics simulations largely fail for systems with dominant quantum effects, such as bonds involving transition metal atoms (Durrant and McCammon 2011). The tools/platforms that can be employed to perform molecular dynamics simulations and analyze the output files after simulation are listed in Table 4.

Table 4 Platforms to perform molecular dynamics simulations and analysis of output files


8 Conclusion

To ease the process of drug discovery, the current era of research is witnessing a shift toward the application of computational tools. Challenges arising from the pleiotropic nature of biomolecules and the interaction of chemical compounds with multiple pharmacological targets (often encountered in combinatorial/multitargeted approaches) can be addressed by chemo- and bioinformatics tools that make use of databases on the physicochemical characteristics and therapeutic use of compounds. Early prediction of the ADMET properties of a chemical compound can be of utmost importance, since most drug failures occur in the later phases due to undesirable pharmacokinetic and toxicological characteristics. Simulating systems of proteins and other receptor molecules interacting with ligands at the atomistic level has become important for identifying novel drug-like compounds. Such simulations account for the dynamic nature of proteins, which sample numerous conformational states that are selectively stabilized when an agonist or antagonist binds. Breakthroughs in hardware-based computational power and the development of new algorithms ease the calculation of the molecular forces that exist in the system. Biological networks can highlight the interacting elements within a complex biochemical system, thus aiding the visualization and exploration of big data. In brief, molecular docking, pharmacophore modeling, (Q)SAR methods, molecular dynamics simulation, network pharmacology, and machine learning algorithms accelerate the drug discovery process and complement traditional bioactivity-guided fractionation, high-throughput screening, and systems biology approaches. The examples listed/tabulated in this chapter cover only a fraction of the popular software/platforms, and readers are encouraged to explore other alternatives in the various domains of drug discovery and protein engineering.

References

Agnihotry S, Pathak RK, Singh DB, Tiwari A, Hussain I (2022) Protein structure prediction. In: Bioinformatics: methods and applications. Academic Press, pp 177–188. https://doi.org/10.1016/B978-0-323-89775-4.00023-7
Almeida VM, Dias ÊR, Souza BC, Cruz JN, Santos CBR, Leite FHA, Queiroz RF, Branco A (2022) Methoxylated flavonols from Vellozia dasypus Seub ethyl acetate active myeloperoxidase extract: in vitro and in silico assays. J Biomol Struct Dyn 40:7574–7583. https://doi.org/10.1080/07391102.2021.1900916
Alves FS, Cruz JN, de Farias Ramos IN, do Nascimento Brandão DL, Queiroz RN, da Silva GV, da Silva GV, Dolabela MF, da Costa ML, Khayat AS, de Arimatéia Rodrigues do Rego J, do Socorro Barros Brasil D (2023) Evaluation of antimicrobial activity and cytotoxicity effects of extracts of Piper nigrum L and piperine. Separations 10. https://doi.org/10.3390/separations10010021
Aydınkal RM, Serçinoğlu O, Ozbek P (2019) ProSNEx: a web-based application for exploration and analysis of protein structures using network formalism. Nucleic Acids Res 47:W471–W476. https://doi.org/10.1093/NAR/GKZ390
Bayarri G, Hospital A, Orozco M (2021) 3dRS, a web-based tool to share interactive representations of 3D biomolecular structures and molecular dynamics trajectories. Front Mol Biosci 8. https://doi.org/10.3389/FMOLB.2021.726232
Bordás B, Kömíves T, Lopata A (2003) Ligand-based computer-aided pesticide design. A review of applications of the CoMFA and CoMSIA methodologies. Pest Manag Sci 59:393–400. https://doi.org/10.1002/PS.614
Braga R, chemistry CA-C topics in medicinal, 2013 undefined Assessing the performance of 3D pharmacophore models in virtual screening: how good are they? ingentaconnect.com
Browne R, Thomas S, in JR-C, 2021 undefined Bioinfomatics as a tool in drug designing. Wiley Online Library
Burley SK, Berman HM, Kleywegt GJ, Markley JL, Nakamura H, Velankar S (2017) Protein Data Bank (PDB): the single global macromolecular structure archive. Methods Mol Biol 1607:627–641. https://doi.org/10.1007/978-1-4939-7000-1_26
Chakrabarty B, Parekh N (2016) NAPS: network analysis of protein structures. Nucleic Acids Res 44:W375–W382. https://doi.org/10.1093/NAR/GKW383
Chavda J, Bhatt H (2019) 3D-QSAR (CoMFA, CoMSIA, HQSAR and topomer CoMFA), MD simulations and molecular docking studies on purinylpyridine derivatives as B-Raf inhibitors for the treatment of melanoma cancer. Struct Chem 30:2093–2107. https://doi.org/10.1007/S11224-019-01334-9
Clevert DA, Le T, Winter R, Montanari F (2021) Img2Mol – accurate SMILES recognition from molecular graphical depictions. Chem Sci 12:14174–14181. https://doi.org/10.1039/D1SC01839F
Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R (2013) Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther 138:333. https://doi.org/10.1016/J.PHARMTHERA.2013.01.016
Daina A, Michielin O, Zoete V (2017) SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci Rep 7:42717
de Lima AM, Siqueira AS, Möller MLS, de Souza RC, Cruz JN, ARJ L, da Silva RC, DCF A, da Junior JLSGV, Gonçalves EC (2022) In silico improvement of the cyanobacterial lectin microvirin and mannose interaction. J Biomol Struct Dyn 40:1064–1073. https://doi.org/10.1080/07391102.2020.1821782
Dearden JC, Cronin MTD, Kaiser KLE (2009) How not to develop a quantitative structure-activity or structure-property relationship (QSAR/QSPR). SAR QSAR Environ Res 20:241–266. https://doi.org/10.1080/10629360902949567
Dolan M, Noah J, Modeling DH-H, 2011 undefined (2012) Comparison of common homology modeling algorithms: application of user-defined alignments. Springer 857:399–414. https://doi.org/10.1007/978-1-61779-588-6_18
Duke J (2020) Database of biologically active phytochemicals & their activity
Dunkel M, Fullbeck M, Neumann S, Preissner R (2006) SuperNatural: a searchable database of available natural compounds. Nucleic Acids Res 34:D678–D683. https://doi.org/10.1093/NAR/GKJ132
Durrant JD, McCammon JA (2011) Molecular dynamics simulations and drug discovery. BMC Biol 9. https://doi.org/10.1186/1741-7007-9-71
Dzemyda G, Kurasova O, Medvedev V, Dzemydaitė G (2019) Visualization of data: methods, software, and applications, 295–307. https://doi.org/10.1007/978-3-030-02487-1_18
Fedyushkina I, Reyes IR, … AL-… SSB, 2014 undefined (1990) Prediction of the action of ligands of steroid hormone receptors. Springer 8:53–58. https://doi.org/10.1134/S1990750814010041
Funahashi A, Morohashi M, … HK-, 2003 undefined CellDesigner: a process diagram editor for gene-regulatory and biochemical networks. oww-files-public.s3.amazonaws.com
Galati S, Di Stefano M, Martinelli E, … MM-IJ of, 2022 undefined (2022) VenomPred: a machine learning based platform for molecular toxicity predictions. mdpi.com. https://doi.org/10.3390/ijms23042105
Ghosh S, Datta A, Choi H (2021) multiSLIDE is a web server for exploring connected elements of biological pathways in multi-omics data. Nat Commun 12(1):1–11. https://doi.org/10.1038/s41467-021-22650-x
Grzegorzewski J, Brandhorst J, Green K, Eleftheriadou D, Duport Y, Barthorscht F, Köller A, Ke DYJ, De Angelis S, König M (2021) PK-DB: pharmacokinetics database for individualized and stratified computational modeling. Nucleic Acids Res 49:D1358–D1364. https://doi.org/10.1093/NAR/GKAA990
Gupta R, Srivastava D, Sahu M, Tiwari S, Ambasta RK, Kumar P (2021) Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers 25:1315–1360. https://doi.org/10.1007/S11030-021-10217-3
Hayward S, Leader DP, Al-Shubailly F, Milner-White EJ (2014) Rings and ribbons in protein structures: characterization using helical parameters and Ramachandran plots for repeating dipeptides. Proteins 82:230–239. https://doi.org/10.1002/PROT.24357
Hess B, Bekker H et al (1997) LINCS: a linear constraint solver for molecular simulations. J Comput Chem 18:1463–1472
Honorato-Zimmer R, Reynaert B, Vergara I, Perez-Acle T (2010) Conan: a platform for complex network analysis
Horvath D (2010) Pharmacophore-based virtual screening. 261–298. https://doi.org/10.1007/978-1-60761-839-3_11
Humer C, Heberle H, Montanari F, Wolf T, Huber F, Henderson R, Heinrich J, Streit M (2022) ChemInformatics Model Explorer (CIME): exploratory analysis of chemical model explanations. J Cheminform 14:1–14. https://doi.org/10.1186/S13321-022-00600-Z
Humphrey W, Dalke A, Schulten K (1996) VMD: visual molecular dynamics. J Mol Graph 14:33–38
Johansson MU, Zoete V, Michielin O, Guex N (2012) Defining and searching for structural motifs using DeepView/Swiss-PdbViewer. BMC Bioinform 13. https://doi.org/10.1186/1471-2105-13-173
Kaczanowski S, Zielenkiewicz P (2010) Why similar protein sequences encode similar three-dimensional structures? Theor Chem Accounts 125:643–650. https://doi.org/10.1007/S00214-009-0656-3
Kalani K, Yadav DK, Khan F, Srivastava SK, Suri N (2012) Pharmacophore, QSAR, and ADME based semisynthesis and in vitro evaluation of ursolic acid analogs for anticancer activity. J Mol Model 18:3389–3413. https://doi.org/10.1007/S00894-011-1327-6
Kar S, Roy K, Leszczynski J (2018) Impact of pharmaceuticals on the environment: risk assessment using QSAR modeling approach. Methods Mol Biol 1800:395–443. https://doi.org/10.1007/978-1-4939-7899-1_19
Karami Y, Guyon F, De Vries S, reports PT-S, 2018 undefined DaReUS-Loop: accurate loop modeling using fragments from remote or unrelated proteins. nature.com
Kaushik AC, Kumar A, Bharadwaj S, Chaudhary R, Sahi S (2018) Three-dimensional (3D) pharmacophore modelling-based drug designing by computational technique. SpringerBriefs Comp Sci:27–31. https://doi.org/10.1007/978-3-319-75732-2_4
Khashan R (2012) FragVLib a free database mining software for generating “fragment-based virtual library” using pocket similarity search of ligand-receptor complexes. J Cheminform 4. https://doi.org/10.1186/1758-2946-4-18
Koumakis L, Kanterakis A, Kartsaki E, Chatzimina M, Zervakis M, Tsiknakis M, Vassou D, Kafetzopoulos D, Marias K, Moustakis V, Potamias G (2016) MinePath: mining for phenotype differential sub-paths in molecular pathways. PLoS Comput Biol 12. https://doi.org/10.1371/JOURNAL.PCBI.1005187
Lagunin A, Ivanov S, Rudik A, Filimonov D, Poroikov V (2013) DIGEP-Pred: web service for in silico prediction of drug-induced gene expression profiles based on structural formula. Bioinformatics 29:2062–2063. https://doi.org/10.1093/BIOINFORMATICS/BTT322
Lagunin A, Goel R, Gawande D, … PP-N product, 2014 undefined Chemo-and bioinformatics resources for in silico drug discovery from medicinal plants beyond their traditional use: a critical review. pubs.rsc.org. https://doi.org/10.1039/c0xx00000x
Lans I, Palacio-Rodríguez K, Cavasotto CN, Cossio P (2020) Flexi-pharma: a molecule-ranking strategy for virtual screening using pharmacophores from ligand-free conformational ensembles. J Comput Aided Mol Des 34:1063–1077. https://doi.org/10.1007/S10822-020-00329-7
Laplaza R, Peccati F, A. Boto R, Quan C, Carbone A, Piquemal JP, Maday Y, Contreras-García J (2021) NCIPLOT and the analysis of noncovalent interactions using the reduced density gradient. Wiley Interdiscip Rev Comput Mol Sci 11. https://doi.org/10.1002/WCMS.1497
Lei T, Li Y, Song Y, Li D, Sun H, Hou T (2016) ADMET evaluation in drug discovery: 15. Accurate prediction of rat oral acute toxicity using relevance vector machine and consensus modeling. J Cheminform 8:1–19. https://doi.org/10.1186/S13321-016-0117-7
Lewis-Atwell T, Townsend P, Tetrahedron M-G, 2021 undefined Comparisons of different force fields in conformational analysis and searching of organic molecules: a review. Elsevier
Li H, Chang Y, Lee J, … IB-N acids, 2017 undefined DynOmics: dynamics of structural proteome and beyond. academic.oup.com
Lim N, Pavlidis P (2021) Evaluation of connectivity map shows limited reproducibility in drug repositioning. Sci Reports 11(1):1–14. https://doi.org/10.1038/s41598-021-97005-z
Lipinski CA (2004) Lead- and drug-like compounds: the rule-of-five revolution. Drug Discov Today Technol 1:337–341. https://doi.org/10.1016/J.DDTEC.2004.11.007
Liu T, Naderi M, Alvin C et al (2017) Break down in order to build up: decomposing small molecules for fragment-based drug design with eMolFrag. J Chem Inf Model 57:627–631. https://doi.org/10.1021/acs.jcim.6b00596
Lu J, Bioinformatics HC-, 2016 undefined ChemTreeMap: an interactive map of biochemical similarity in molecular datasets. academic.oup.com
Lu XJ, Olson WK (2003) 3DNA: a software package for the analysis, rebuilding and visualization of three-dimensional nucleic acid structures. Nucleic Acids Res 31:5108–5121. https://doi.org/10.1093/NAR/GKG680
Luo W, Pant G, Bhavnasi YK, Blanchard SG, Brouwer C (2017) Pathview Web: user friendly pathway visualization and data integration. Nucleic Acids Res 45:W501–W508. https://doi.org/10.1093/NAR/GKX372
Magnotti EL, Moy J, Sleppy R, Carey A, Firdyiwek Y, Garrett RH, Grisham CM (2019) Developing and implementing a free online protein structure and function exploration project to teach undergraduate students macromolecular structure-function relationships. J Chem Educ 96:729–733. https://doi.org/10.1021/ACS.JCHEMED.8B00956
Malathi K, Engineering SR-B, G, 2018 undefined (2018) Bioinformatics approaches for new drug discovery: a review. Taylor & Francis 34:243–260. https://doi.org/10.1080/02648725.2018.1502984
Marvel SW, To K, Grimm FA, Wright FA, Rusyn I, Reif DM (2018) ToxPi Graphical User Interface 2.0: dynamic exploration, visualization, and sharing of integrated data models. BMC Bioinform 19. https://doi.org/10.1186/S12859-018-2089-2
Mosihuzzaman M, Choudhary MI (2008) Protocols on safety, efficacy, standardization, and documentation of herbal medicine (IUPAC technical report). Pure Appl Chem 80:2195–2230. https://doi.org/10.1351/PAC200880102195
Muzammil S, Neves Cruz J, Mumtaz R, Rasul I, Hayat S, Khan MA, Khan AM, Ijaz MU, Lima RR, Zubair M (2023) Effects of drying temperature and solvents on in vitro diabetic wound healing potential of Moringa oleifera leaf extracts. Molecules 28
Pearson WR (2013) An introduction to sequence similarity (“homology”) searching. Curr Protoc Bioinformatics. https://doi.org/10.1002/0471250953.BI0301S42
Pirch S, Müller F, Iofinova E, Pazmandi J, Hütter CVR, Chiettini M, Sin C, Boztug K, Podkosova I, Kaufmann H, Menche J (2021) The VRNetzer platform enables interactive network analysis in virtual reality. Nat Commun 12(1):–14. https://doi.org/10.1038/s41467-021-22570-w
Probst D, Reymond JL (2018) SmilesDrawer: parsing and drawing SMILES-encoded molecular structures using client-side JavaScript. J Chem Inf Model 58:1–7. https://doi.org/10.1021/ACS.JCIM.7B00425
Rajan K, Zielesny A, Steinbeck C (2021) DECIMER 1.0: deep learning for chemical image recognition using transformers. J Cheminform 13:1–16. https://doi.org/10.1186/S13321-021-00538-8
Rego CMA, Francisco AF, Boeno CN, Paloschi MV, Lopes JA, MDS S, Santana HM, Serrath SN, Rodrigues JE, Lemos CTL, Dutra RSS, da Cruz JN, dos Santos CBR, da S. Setúbal S, Fontes MRM, Soares AM, Pires WL, Zuliani JP (2022) Inflammasome NLRP3 activation induced by Convulxin, a C-type lectin-like isolated from Crotalus durissus terrificus snake venom. Sci Rep 12:1–17. https://doi.org/10.1038/s41598-022-08735-7
Rifaioglu AS, Atas H, Martin MJ, Cetin-Atalay R, Atalay V, Doǧan T (2019) Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief Bioinform 20:1878–1912. https://doi.org/10.1093/BIB/BBY061
Saeedipour S, Tai D, Fang J (2015) ChemCom: a software program for searching and comparing chemical libraries. J Chem Inf Model 55:1292–1296. https://doi.org/10.1021/CI500713S
Sahoo RN, Pattanaik S, Pattnaik G, Mallick S, Mohapatra R Review on the use of Molecular Docking as the First Line Tool in Drug Discovery and Development. ijpsonline.com 1334
Salomon-Ferrer R, Case DA, Walker RC (2013) An overview of the Amber biomolecular simulation package. Wiley Interdiscip Rev Comput Mol Sci 3:198–210. https://doi.org/10.1002/wcms.1121
Sander T, Freyss J, von Korff M, Rufener C (2015) DataWarrior: an open-source program for chemistry aware data visualization and analysis. J Chem Inf Model 55:460–473. https://doi.org/10.1021/ci500588j
Schyman P, Liu R, Desai V, Wallqvist A (2017) vNN web server for ADMET predictions. Front Pharmacol 8. https://doi.org/10.3389/FPHAR.2017.00889
Sellis D, Vlachakis D, Vlassi M (2009) Gromita: a fully integrated Graphical User Interface to Gromacs 4. Bioinform Biol Insights 3:99–102. https://doi.org/10.4137/BBI.S3207
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13:2498–2504. https://doi.org/10.1101/gr.1239303
Singh N, Chaput L, Villoutreix BO (2021) Virtual screening web servers: designing chemical probes and drug candidates in the cyberspace. Brief Bioinform 22:1790–1818. https://doi.org/10.1093/BIB/BBAA034
Spyropoulos IC, Liakopoulos TD, Bagos PG, Hamodrakas SJ (2004) TMRPres2D: high quality visual representation of transmembrane protein models. Bioinformatics 20:3258–3260. https://doi.org/10.1093/bioinformatics/bth358
Stasinakis P, Molecular DN-B and, 2017 undefined (2017) Modeling of DNA and protein organization levels with Cn3D software. Wiley Online Library 45:126–129. https://doi.org/10.1002/bmb.20998
Stierand K, Rarey M (2010) PoseView -- molecular interaction patterns at a glance. J Cheminform 2. https://doi.org/10.1186/1758-2946-2-S1-P50
Strasser A, Wittmann H-J (2013) Construction of ligands. Modelling of GPCRs, 29–36. https://doi.org/10.1007/978-94-007-4596-4_4
Sun X, Zhang Y, Zhou Y, Lian X, Yan L, Pan T, Jin T, Xie H, Liang Z, Qiu W, Wang J, Li Z, Zhu F, Sui X (2022) NPCDR: natural product-based drug combination and its disease-specific molecular regulation. Nucleic Acids Res 50:D1324–D1333. https://doi.org/10.1093/NAR/GKAB913
Tang Q, Nie F, Zhao Q, Bioinformatics WC-B in, 2022 undefined A merged molecular representation deep learning method for blood–brain barrier permeability prediction. academic.oup.com 2022:1–10. https://doi.org/10.1093/bib/bbac357
Tao A, Huang Y, Shinohara Y, Caylor ML, Pashikanti S, Xu D (2019) EzCADD: a rapid 2D/3D visualization-enabled web modeling environment for democratizing computer-aided drug design. J Chem Inf Model 59:18–24. https://doi.org/10.1021/ACS.JCIM.8B00633
Tarini M, Cignoni P (2006) QuteMol
Valsecchi C, Grisoni F, Motta S, … LB-T and A, 2020 undefined NURA: a curated dataset of nuclear receptor modulators. Elsevier
Van Der Spoel D, Lindahl E, Hess B, Groenhof G, Mark AE, Berendsen HJC (2005) GROMACS: fast, flexible, and free. J Comput Chem 26:1701–1718. https://doi.org/10.1002/JCC.20291
Wang T, Sun J, Medicine QZ-C in B and, 2022 undefined Investigating cardiotoxicity related with hERG channel blockers using molecular fingerprints and graph attention mechanism. Elsevier
Weber JR (2009) ProteinShader: illustrative rendering of macromolecules. BMC Struct Biol 9. https://doi.org/10.1186/1472-6807-9-19
Weir H, Thompson K, Woodward A, Choi B, Braun A, Martínez TJ (2021) ChemPix: automated recognition of hand-drawn hydrocarbon structures using deep learning. Chem Sci 12:10622–10633. https://doi.org/10.1039/D1SC02957F
Wink LH, Baker DL, Cole JA, Parrill AL (2019) A benchmark study of loop modeling methods applied to G protein-coupled receptors. J Comput Aided Mol Des 33:573–595. https://doi.org/10.1007/S10822-019-00196-X
Wong YI, Kwan G (2015) Identification of protein-ligand binding site using machine learning and hybrid pre-processing techniques
Wu L, Yan B, Han J, Li R, Xiao J, He S, … (2022) TOXRIC: a comprehensive database of toxicological data and benchmarks. Nucleic Acids Res
Xie T, Song S, Li S, Ouyang L, Xia L (2015) Review of natural product databases. Cell Prolif
Xie Y, Li H, Luo X, Li H, Gao Q, Zhang L, Teng Y, Zhao Q, Zuo Z, Ren J (2022) IBS 2.0: an upgraded illustrator for the visualization of biological sequences. Nucleic Acids Res 50:W420–W426. https://doi.org/10.1093/NAR/GKAC373
Yadav DK, Meena A, Srivastava A, Chanda D, Khan F, Chattopadhyay S (2010) Development of QSAR model for immunomodulatory activity of natural coumarinolignoids. Drug Des Devel Ther 4:173. https://doi.org/10.2147/DDDT.S10875
Yuan S, Chan H, Hu Z (2017) Using PyMOL as a platform for computational drug design. Wiley Interdiscip Rev Comput Mol Sci 7. https://doi.org/10.1002/wcms.1298
Zhang J, Chiodini R, Badr A, Zhang G (2011) The impact of next-generation sequencing on genomics. J Genet Genomics 38:95. https://doi.org/10.1016/J.JGG.2011.02.003
Zhou P, Shang Z (2009) 2D molecular graphics: a flattened world of chemistry and biology. Brief Bioinform 10:247–258. https://doi.org/10.1093/BIB/BBP013
Zhou G, Xia J (2018) OmicsNet: a web-based tool for creation and visual analysis of biological networks in 3D space. Nucleic Acids Res 46:W514–W522. https://doi.org/10.1093/NAR/GKY510

Acknowledgments

The authors are thankful to the West Bengal State Government Departmental Fellowship Scheme of Jadavpur University for providing the manpower and necessary resources.

Author information

Authors and Affiliations

School of Bioscience and Engineering, Jadavpur University, Kolkata, India
Tathagata Adhikary & Piyali Basak

Editor information

Editors and Affiliations

Adolpho Ducke Laboratory, Paraense Emilio Goeldi Museum, Belém, Pará, Brazil
Jorddy Neves Cruz

About this chapter

Cite this chapter

Adhikary, T., Basak, P. (2023). Software for Drug Discovery and Protein Engineering: A Comparison Between the Alternatives and Recent Advancements in Computational Biology. In: Cruz, J.N. (eds) Drug Discovery and Design Using Natural Products. Springer, Cham. https://doi.org/10.1007/978-3-031-35205-8_9

DOI: https://doi.org/10.1007/978-3-031-35205-8_9
Published: 29 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35204-1
Online ISBN: 978-3-031-35205-8
eBook Packages: Medicine, Medicine (R0)
