Abstract
Computational methods are a powerful knowledge-based approach that helps to select plant material or natural products (NP) with an increased likelihood of biological activity. These methods enable the rationalization of biological activities of NP and help to elucidate the putative protein-ligand binding characteristics of these molecules. In this way, focusing on highly ranked virtual hits from properly validated in silico models is a rational way to streamline experimental efforts. In silico approaches can focus on well-known constituents of herbal remedies as well as on any natural compound with relevant biological effects directly retrieved from the literature. They may further be helpful for the selection of promising starting material for an experimental work-up. This chapter provides a general overview for students and researchers who are about to enter this emerging and exciting field of science. It gives a brief introduction to the field of cheminformatics and presents different virtual screening strategies implemented in pharmacognostic workflows to point out opportunities and challenges in NP-based drug discovery.
1 Computational Tools in Drug Discovery
Computer-aided drug design is of emerging importance and usefulness for the development of therapeutically relevant small molecules. With recent advances in structural determination, e.g., cryo-electron microscopy (Fernandez-Leiro and Scheres 2016), we face a growing number of possible drug targets and binding positions. Likewise, combinatorial chemistry and high-throughput screening, performed extensively in the last 20 years, have led to an explosion in the number of small molecules and in vitro data (Gaulton et al. 2012; Yan et al. 2006; Lo et al. 2018). Although not as dramatic as in synthetic chemistry, the number of small molecules from nature and their occupied chemical space is constantly increasing through the discovery of new sources (e.g., marine and microbial organisms) and the diligent exploitation of already known material (Pye et al. 2017; Harvey et al. 2015). Learning and making predictions from this large quantity of existing data offer the chance of rationalizing research. With the help of computational tools, extrapolation from this huge volume of data enables the prediction of new events, such as a molecule’s putative ligand-target interaction, biological activity, or its properties including metabolism, toxicology, and pharmacokinetics. The implementation of computational methods aims to concentrate the capacities for experimental testing on fewer but more promising candidates. It can thus help to save time and money by streamlining experimental efforts (Sliwoski et al. 2014). In iterative research processes, e.g., lead optimization and bioassay-guided fractionation, it can guide research projects. Recent advances in machine learning algorithms, molecular dynamics, more accurate ADMET predictions, and the fading of the earlier computational power bottleneck have brought computer-aided drug discovery (CADD) into the spotlight of research interest.
Especially in the field of NP, which is generally regarded as an expensive endeavor (Strohl 2000), in silico tools can help to overcome key difficulties and fast-forward investigations as recently reviewed (Rollinger et al. 2006a, b; Rollinger 2009, Rollinger et al. 2009; Rollinger and Wolber 2011, Kaserer et al. 2018). Although delayed, their application in NP research has led to outstanding findings. However, the interface of computer-aided drug design and NP research (two multidisciplinary disciplines on their own) is highly interdisciplinary as it embraces organic chemistry and phytochemistry, informatics, structural biology, genomics, biochemistry, pharmacology, mathematics, biophysics, and medicinal chemistry besides other fields. Furthermore, there is an abundance of CADD tools suited for different applications and problems (Schneider 2010; Gasteiger 2016).
This survey gives a general overview for students and researchers who are about to enter this emerging and exciting field. It gives a brief introduction to the field of cheminformatics and further presents and explains different virtual screening (VS) approaches with a focus on opportunities and obstacles, both in general and in combination with NP. Finally, the chapter presents various studies embracing different in silico strategies to exemplify the great diversity of VS in NP drug discovery.
2 A Brief Introduction to Cheminformatics
2.1 Definition of Cheminformatics
In 1998, F. K. Brown (1998) coined the striking definition: “Cheminformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization.” Since then, the cheminformatics field has expanded and is today better and more generally defined as the broad field of solving chemical problems with computational methods (Gasteiger and Engel 2003). These methods can be of great assistance, but it is essential to understand the basic concepts and foundations of cheminformatics: the representation of chemical structures, the calculation of molecular descriptors and chemical fingerprints, and the analysis of chemical space.
2.2 Chemical Structure Formats
All cheminformatics tools are based on the representation and storage of chemical structures in a format accessible to software tools. As computers can only manage bits of 0 and 1, languages have been sought which are readable for computers and operators. They should be fast processible and have a small memory footprint but a maximum of information content. Similar to human chemical description, there are also several complexity levels for computer chemical structure exchange formats depicted in Fig. 1. For each hierarchy level, there have been plenty of language formats developed, most specialized to particular software tools and incorporating specific information. The interconversion from one to another is often difficult, sometimes impossible to perform without loss of information (Kirchmair et al. 2008). However, there are some standard structure exchange formats implemented by nearly all software applications:
1. SMILES codes, developed by Daylight (O’Boyle 2012; Weininger 1988), are simple line notations of molecules, enabling memory-saving storage. The atoms of a molecule are simply notated by their connections. The SMILES code for cyclohexane, C1CCCCC1, is simple; for larger molecules like piperine (C1CCN(CC1)C(=O)C=CC=CC2=CC3=C(C=C2)OCO3) or strychnine (C1CN2CC3=CCO[C@@H]4CC(=O)N5[C@@H]6[C@@H]4[C@H]3C[C@@H]2C61C7=CC=CC=), it gets more complex, and some additional rules are necessary, but the codes still need little memory. Because of its linguistic construction, operators can easily learn this language. Moreover, there are rules for canonical notations, making the canonical SMILES synonymous with a molecule. Isomeric SMILES can incorporate information about double bond geometry and chirality and should therefore be used for three-dimensional (3D) applications.
2. The connection table formats (Dalby et al. 1992) mol and sdf are more memory intensive and, although still text based, less comprehensible for researchers (Table 1). They can incorporate additional information on the compound and support 3D information. Sdf files can store series of molfiles joined together and can therefore be used for exchanging libraries of structure data, together with annotated metadata.
3. The mol2 file format is also a text-based connection table format. It can store 3D small molecule representations but also large proteins and nucleic acids.
4. The Protein Data Bank (PDB) format was developed for the 3D storage of biological macromolecular structures (proteins and nucleic acids). It is a text-based format with rather complicated ordering and formatting of data sections and records (record = line of information), atomic coordinates, assignment of secondary structures, and side-chain rotamers. Bonds are not specified, which makes the correct reconstruction of structures error-prone. For the atoms of amino acid standard residues, only 3D coordinates are stored, because the connectivity of the atoms is heuristic and can be looked up in implemented databases. The pdb format is the standard exchange format for proteins and other macromolecules (Berman et al. 2003; Henrick et al. 2008).
A broader treatment of chemical structure formats and their processing would by far exceed the scope of this chapter; interested readers are referred to the literature (Gasteiger and Engel 2003).
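To illustrate how an sdf file joins several molfiles with annotated metadata, the following minimal sketch splits SD file text into records. It assumes the usual layout in which each record ends with a `$$$$` line, the structure block ends with `M  END`, and data items are introduced by `> <FIELD>` header lines. The sample text and the function names `parse_record` and `parse_sdf` are hand-written for illustration and do not come from any particular toolkit.

```python
# Hand-written, illustrative two-record SD file (not taken from a database).
SAMPLE_SDF = """water
  hand-written example

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <SOURCE>
illustrative entry

$$$$
methane
  hand-written example

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <SOURCE>
illustrative entry

$$$$
"""


def parse_record(lines):
    """Split one SD record into its molfile block and its data items."""
    end = next(i for i, line in enumerate(lines) if line.startswith("M  END"))
    molblock = "\n".join(lines[: end + 1])
    data, key = {}, None
    for line in lines[end + 1:]:
        if line.startswith(">") and "<" in line:
            start = line.index("<") + 1
            key = line[start:line.index(">", start)]  # text between < and >
            data[key] = []
        elif line.strip() and key is not None:
            data[key].append(line)  # value line of the current data item
        else:
            key = None  # a blank line ends the current data item
    return molblock, {k: "\n".join(v) for k, v in data.items()}


def parse_sdf(text):
    """Return a list of (molblock, data) tuples, one per '$$$$'-terminated record."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(parse_record(current))
            current = []
        else:
            current.append(line)
    return records


records = parse_sdf(SAMPLE_SDF)
```

Each returned tuple keeps the structure block untouched, so it could be handed to a cheminformatics toolkit for actual parsing of atoms and bonds, which this sketch deliberately does not attempt.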
2.3 Molecular Descriptors
The chemical information incorporated in a molecular structure format can be transformed into quantitative molecular descriptors, which play a fundamental role in cheminformatics (Fig. 1) (Corwin et al. 1995; Danishuddin and Khan 2016). Numerical molecular descriptors can be scalar (one-dimensional), e.g., heavy atom count and molecular weight. Two-dimensional (2D) chemical descriptors include topological indices or molecular profiles, and 3D descriptors extract their content from 3D coordinate representations, e.g., surface/volume descriptors and pharmacophore descriptors. Four-dimensional descriptors are 3D descriptors considering multiple conformations (Bajorath 2001; Karelson et al. 1996; Sliwoski et al. 2016) (Todeschini and Consonni 2008; Lo et al. 2018). Some descriptors can be obtained by standardized experimental measurements (physicochemical properties) but most of them by a mathematical calculation, which transforms chemical information present in the molecular structure into a useful, sometimes purely abstract number. The number of numerical descriptors is growing, e.g., the commercial DRAGON system can generate up to 5000 different descriptors (Sawada et al. 2014; Chavan et al. 2014). They can be computed by several software and online tools or easily defined by simple scripts. Tetko reviews some resources for molecular descriptors and tools to calculate them (Tetko 2003). The most important application for molecular descriptors is quantitative structure activity relationship (QSAR) and quantitative structure property relationship (QSPR) but also chemical similarity analysis.
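As a hedged illustration of how scalar descriptors can be defined by a simple script, the sketch below computes molecular weight and heavy atom count from a molecular formula given as an element-count dictionary. The input representation and the function name `scalar_descriptors` are assumptions made for this example; the atomic masses are rounded standard values, and only the four elements needed here are listed.

```python
# Rounded standard atomic masses for the elements used in this example.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def scalar_descriptors(formula):
    """Compute two one-dimensional descriptors from an element-count dict."""
    molecular_weight = sum(ATOMIC_MASS[el] * n for el, n in formula.items())
    heavy_atom_count = sum(n for el, n in formula.items() if el != "H")
    return {
        "molecular_weight": round(molecular_weight, 3),
        "heavy_atom_count": heavy_atom_count,
    }


cyclohexane = scalar_descriptors({"C": 6, "H": 12})
piperine = scalar_descriptors({"C": 17, "H": 19, "N": 1, "O": 3})
```

Descriptors of higher dimensionality need connectivity or 3D coordinates and are therefore usually computed with dedicated software rather than by a script of this size.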
2.4 Molecular Fingerprints
Molecular fingerprints can be an abstract but useful way to encode structural features of molecules. Basically, they are bit strings or high-dimensional vectors, which are generated by hashing functions. The most widespread molecular fingerprints are the binary Molecular ACCess System (MACCS) keys. Binary means they only use two digits (0 and 1). For the presence or absence of each of the 166 substructures in a molecule, ones and zeros are assigned (Fig. 1). On the one hand, this bit set defines a molecule quite well; on the other hand, it enables easier computation than a “bulky” molecular representation. The simplicity of bit set fingerprints compared to complex molecular representations is the most important reason to use them for similarity searches. PubChem, e.g., uses its own PubChem substructure fingerprint with 881 fragments for similarity searching. Besides such 2D substructure fingerprints, there exists a broad palette of other fingerprints including topological (connectivity or spatial distribution of fragments), pharmacophore, text-based, protein-ligand interaction, and hybrid fingerprints. Besides similarity searching, they are useful tools for VS.
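The two ideas in this section, key-based bits for predefined substructures and bits set by a hashing function, can be sketched as follows. This is a toy example under strong assumptions: the eight keys are invented for illustration (they are not the MACCS or PubChem definitions), and the substructures or fragments of a molecule are supplied by hand, because detecting them would require a cheminformatics toolkit.

```python
import hashlib

# Invented, illustrative key list; real key sets such as MACCS differ.
STRUCTURAL_KEYS = [
    "aromatic ring", "carbonyl", "amide", "ether",
    "hydroxyl", "primary amine", "halogen", "carboxylic acid",
]


def key_fingerprint(found_substructures):
    """One bit per predefined key: 1 if the substructure is present, else 0."""
    return [1 if key in found_substructures else 0 for key in STRUCTURAL_KEYS]


def hashed_fingerprint(fragments, n_bits=64):
    """Map each fragment string to a bit position with a hash function."""
    bits = [0] * n_bits
    for fragment in fragments:
        digest = hashlib.sha256(fragment.encode("utf-8")).digest()
        bits[int.from_bytes(digest[:4], "big") % n_bits] = 1
    return bits


# Hand-assigned features for piperine (assumption, for illustration only).
piperine_keys = key_fingerprint({"aromatic ring", "carbonyl", "amide", "ether"})
piperine_hashed = hashed_fingerprint(["C=O", "C=C", "c1ccccc1", "OCO", "N(C)C"])
```

In the hashed variant, two different fragments may collide on the same bit, which is the price paid for a fixed vector length that does not depend on a predefined key list.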
2.5 Chemical Space Analysis
Characterization of molecules with quantitative descriptors enables their arrangement in a multidimensional space. Placement of small molecules in this chemical space enables their classification and comparison. A general approach would be to calculate numerical descriptors (e.g., physicochemical or structural descriptors) and perform a principal component analysis to reduce the descriptor vector to a 2D or 3D space which can be plotted (Singh et al. 2009; Wetzel et al. 2007). Chemical space examination has grown to major relevance, since it was shown that biologically relevant small molecules occupy only small regions of the possible chemical space. While in theory 10^60 small organic molecules are conceivable (Reymond et al. 2010), most of the chemical space they occupy can be regarded as useless for drug discovery (Payne et al. 2006; Macarron 2006). Good news for NP researchers is the fact that NP occupy exactly this biologically important space, which is in accordance with the understanding of NP as privileged structures (Gu et al. 2013; Harvey et al. 2015). Since there is great interest in compound libraries which are focused on this drug-relevant space (Akella and DeCaprio 2010), screening libraries with compounds similar to NP are being created (Cordier et al. 2008), and their quality is assessed with an NP likeness score (Jayaseelan et al. 2012). Chemical space analysis has high applicability, e.g., for the quality control of VS libraries, for the comparison of query molecules, or for the selection of virtual hits. A useful and comprehensible resource is the free chemGPS-NP online tool (Larsson et al. 2007). For uploaded molecules, 35 descriptors converted to 8 principal components are calculated and returned. The principal components are interpretable, as they reflect properties like a molecule’s size, shape, or flexibility. An iterative process for the definition of PCA space avoids outliers.
The chemical space analysis of chemGPS compares all molecules from a reference set to a test set (each dot stands for a molecule) (Larsson et al. 2007). If it is more appropriate to compare each molecule of a test set against the mean of a reference set, multi-fusion similarity mapping based on molecular fingerprints is a viable alternative (Medina-Franco et al. 2007; Singh et al. 2009).
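The general approach described above (calculate descriptors, then reduce them by principal component analysis to a plottable 2D space) can be sketched in plain Python. This is a minimal illustration, not the chemGPS-NP procedure: the descriptor values are invented, the descriptors are standardized before analysis, and the components are found by power iteration with deflation, which is only one of several ways to compute them.

```python
import math

# Invented descriptor rows: [molecular weight, logP-like value,
# H-bond donors, rotatable bonds]. Values are for illustration only.
DESCRIPTORS = [
    [150.0, 1.0, 1.0, 2.0],
    [220.0, 2.1, 2.0, 3.0],
    [310.0, 2.9, 2.0, 5.0],
    [400.0, 4.2, 4.0, 7.0],
    [480.0, 4.8, 5.0, 8.0],
    [180.0, 3.5, 0.0, 6.0],
]


def standardize(rows):
    """Scale every descriptor column to zero mean and unit variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(m)]
    return [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]


def covariance(rows):
    """Covariance matrix of already centered rows."""
    n, m = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(m)]
            for i in range(m)]


def top_component(cov, iters=1000):
    """Leading eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    m = len(cov)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(m))
                     for i in range(m))
    return eigenvalue, v


def pca(rows, n_components=2):
    """Project standardized descriptor rows onto the first principal components."""
    z = standardize(rows)
    cov = covariance(z)
    components = []
    for _ in range(n_components):
        eigenvalue, v = top_component(cov)
        components.append(v)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    return [[sum(r[j] * c[j] for j in range(len(c))) for c in components]
            for r in z]


scores = pca(DESCRIPTORS)  # one (PC1, PC2) pair per molecule, ready to plot
```

Each molecule ends up as one point in the plane of the first two principal components, which is the kind of plot used to compare a test set with a reference set.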
2.6 Chemical Similarity Analysis
The basic principle of chemical similarity analysis is the assumption that compounds with similar structures in the narrower or the wider sense have similar biological properties, which is often but not always the case (Hu et al. 2013; Stumpfe et al. 2014; Bajorath 2017). Chemical similarity is quantified by distance/similarity metrics between fingerprints derived from molecular features (e.g., topological, physicochemical, or pharmacophore features). The most important similarity metric is probably the Tanimoto coefficient (TC). The higher the TC score, which ranges from 0 to 1, the higher the similarity between two molecules (Bajusz et al. 2015). Chemical similarity search can be used for VS, e.g., 3D shape and pharmacophore matching. Moreover, molecules can be clustered into groups based on their similarity in chemical reference space. Molecules in the same cluster are similar to each other; molecules in different clusters are thought to be different from each other. Several cluster analysis methods are available, e.g., Jarvis-Patrick (nonhierarchical) or Ward’s (hierarchical) methods; there is, however, no universal solution for all problems, as there is no single measure of similarity. Similarity clustering can also be used for target fishing, e.g., by comparing features of query compound fragments to pre-calculated drug compound clusters (Reker et al. 2014; Rodrigues et al. 2016). Beyond these remarkable studies, similarity measurements and clustering techniques are of utmost importance in cheminformatics in general. For additional reading, the reader is referred to Nikolova and Jaworska (2003).
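For binary fingerprints, the Tanimoto coefficient is commonly computed as TC = c / (a + b - c), where a and b are the numbers of bits set in the two fingerprints and c is the number of bits set in both. The sketch below applies this to fingerprints stored as sets of on-bit positions and ranks a small library against a query. The bit positions and compound names are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)  # equals a + b - c
    return len(a & b) / union if union else 0.0


# Invented on-bit positions for a query and three library compounds.
query = {1, 2, 3, 5}
library = {
    "compound_1": {2, 3, 5, 8, 13},
    "compound_2": {1, 2, 3, 5},
    "compound_3": {21, 34},
}

# Rank library compounds from most to least similar to the query.
ranking = sorted(library, key=lambda name: tanimoto(query, library[name]),
                 reverse=True)
```

Sorting a library by TC against a known active is the simplest form of a similarity search; a similarity threshold or a clustering method would then decide which compounds are treated as belonging together.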
3 Virtual Screening for Hit Generation from Natural Products
3.1 Definition
Similar to physical high-throughput screening, where a large number of compounds is tested in wet lab in any assay to identify those compounds which exert biological activities, VS is an in silico technique to virtually probe molecular libraries for those structures, which are most promising to putatively exert an activity on a focused target. It is more or less the mining of hypothetical molecule piles to identify the most promising candidates to possess a desired property (i.e., activity on a target) (Rester 2008). For evaluation of the predictive power of the used filtering tool (model), and primarily for the identification of hit compounds, the experimental testing of predicted hits is an indispensable component in the drug discovery process. VS is always a heavily knowledge-driven process and depends on the information already available for the system under investigation. The quality and amount of information and its preceding selection and preparation is imperative for successful experiments (Sichao et al. 2013). Moreover, the subjective expertise of an operator or working group should not be underestimated (Ban et al. 2017). Generally, one can classify VS into structure-based approaches with information from experimental protein structures, either from X-ray crystallography, NMR, or computational homology models, and ligand-based approaches with information on known ligands. Although usually less reliable than the structure-based techniques, the latter approach is still the method of choice for membrane-bound G-protein-coupled receptors and ion channels (Evers et al. 2005; Seidel et al. 2010). A comprehensive list of resources for VS is provided at http://www.click2drug.org.
3.2 Molecular Docking
Molecular docking is regarded by many as the central technology for VS; therefore, the programs are under intense development and evaluation (Kitchen et al. 2004; Hauser and Windshügel 2016; Cavasotto and Orry 2007; Grosdidier et al. 2011). They can filter out promising hits in a virtual database but can also give answers to related problems like prediction of binding pose and affinity (Anderson 2003; Jain and Nicholls 2008). Molecular docking is a computational approach that first aims to predict the binding of a ligand (in a specific conformation) to a target. Second, the binding affinities of the predicted poses are approximated. The binding modes of the virtually screened molecules are usually ranked from estimated most active to inactive ones. Prediction of (a) the correct binding pose with search algorithms and (b) the correct estimation of binding affinities, termed scoring, is a nontrivial task. This is why more than 60 different docking tools and programs with different search algorithms and scoring functions have been developed (Pagadala et al. 2017). As each has its own benefits and shortcomings, their performance has been evaluated several times, but the claims of superiority differ largely. Commercial software does not necessarily outperform open-source tools (Wang et al. 2016; Huang et al. 2010; Warren et al. 2006). According to Chen (2015), the three most frequently used programs are the freeware AutoDock (Osterberg et al. 2002) and the commercial programs GOLD (Jones et al. 1997) and Glide (Friesner et al. 2006). Whereas AutoDock uses stochastic search algorithms, GOLD is based on genetic and Glide on systematic ones (Taylor et al. 2002). Other commonly used programs are FlexX, Surflex, LigandFit, Dock, and AutoDock Vina, besides different web services like SwissDock (Grosdidier et al. 2011). It has to be emphasized that the choice of software and algorithm(s) strongly depends on the focus of the VS project.
Chen has recently given a comprehensive overview on different applications and their benefits (Chen 2015). Although molecular docking is illustrative and has led to outstanding findings in drug design (Shoichet et al. 2002; Claude Cohen 2007), shortcomings independent of the software are pervasive. While docking algorithms perform quite well in sampling correct binding poses, it is still impossible to calculate the solvation effects and entropic parts of ligand binding energy, causing inaccurate scoring. Another shortcoming is the time-consuming screening as high calculation demands are necessary causing programs to balance accuracy and speed. Issues like side-chain flexibility, explicit water, as well as solvent effects and backbone movements are currently under research. Nevertheless the long list of unclear issues can impede success by causing high numbers of false-positive hits (Pagadala et al. 2017; Warren et al. 2006; El-Houri et al. 2015). Moreover, some tools are empirically trained to molecules diverging from NP, which are more flexible and have a higher molecular weight (Wetzel et al. 2007), which questions their suitability to such exercises (Rollinger and Wolber 2011). However, molecular docking bears some distinct advantages over other VS techniques, e.g., incorporation of structural and mechanistic information, which gives the possibility to not only identify novel binders and scaffolds but also novel modes of binding (Ma et al. 2011).
With the exponential increase in computer processing power, advances in elucidation of macromolecule structures, consensus and machine learning-based scoring methods (Oda et al. 2006; Wójcikowski et al. 2017), and growing understanding of intermolecular interactions that take part in the protein-ligand molecular interaction (Ren et al. 2014; Yang et al. 2015), docking will continue to play its important role in drug discovery and optimization (Wang and Zhu 2016).
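One simple way to combine several scoring functions into a consensus is to rank the ligands separately under each function and average the ranks. The sketch below shows this rank-averaging scheme as one possible illustration of consensus scoring; it is not the method of any specific program cited here. The ligand names and scores are invented, and lower scores are assumed to be better, as is common for energy-like docking scores.

```python
def ranks(scores):
    """Rank ligands by score; rank 1 is the best (lowest) score."""
    ordered = sorted(scores, key=scores.get)
    return {name: position + 1 for position, name in enumerate(ordered)}


def consensus_ranking(score_tables):
    """Order ligands by their mean rank over all scoring functions."""
    all_ranks = [ranks(table) for table in score_tables]
    names = list(all_ranks[0])
    mean_rank = {n: sum(r[n] for r in all_ranks) / len(all_ranks) for n in names}
    return sorted(names, key=lambda n: mean_rank[n]), mean_rank


# Invented scores from three hypothetical scoring functions.
scoring_function_1 = {"A": -9.1, "B": -7.4, "C": -8.2, "D": -6.0}
scoring_function_2 = {"A": -8.4, "B": -7.0, "C": -8.8, "D": -7.5}
scoring_function_3 = {"A": -10.2, "B": -9.0, "C": -9.5, "D": -8.1}

order, mean_rank = consensus_ranking(
    [scoring_function_1, scoring_function_2, scoring_function_3]
)
```

Working with ranks instead of raw scores sidesteps the problem that different scoring functions report values on different scales.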
3.3 Target-Based Pharmacophores
The “pharmacophore” concept is approximately 50 years old and is today defined by the International Union of Pure and Applied Chemistry as “the ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target structure and to trigger (or to block) its biological response” (Wermuth et al. 1998). The concept states that the complexity of ligand-target interactions can be reduced to a distinct abstract blueprint, a 3D arrangement of the most important interaction types and their proximity and angle between each other. Ligands that comprise similar combinations of pharmacophore features in similar spatial orientation are likely to have similar activity towards a biological macromolecule. That is why query pharmacophore models can act as proficient VS filters (Van Drie 2010; Langer and Wolber 2004). Figure 2 shows an illustrating example of such pharmacophore models. The interaction types are commonly named pharmacophore features and comprise the chemical characteristics of functional groups participating in the ligand-target interactions. Different types of occurring interactions are classified into a set of a few pharmacophore features like charged group features for ionic interactions, hydrogen bond donor and acceptor features for hydrogen bonds, aromatic features for π-π and π-cation interactions, and lipophilic features for Van der Waal’s interactions between receptor and ligand. Steric constraints can be simulated by negative features, so-called exclusion volumes. Other pharmacophore features including metal complexing, ring, covalent, and halogen bonding features may be implemented additionally in some tools. There are several software packages for pharmacophore modeling and VS, most importantly LigandScout, Phase, Catalyst, and MOE (Seidel et al. 2010). 
As there is no golden algorithm for alignment, a previously performed comparative VS experiment recognized a high variance between the hit lists obtained from Phase, Catalyst, and MOE (Spitzer et al. 2010). Query pharmacophore models can be created by two approaches. With a target-based approach, the pharmacophore is extrapolated from experimental structures of protein targets or homology models. If there is no target structure available, the ligand-based approach can still represent a reasonable method by extracting information on known ligands. Because the target-based approach is built on observed complementarities between ligand and target, it better incorporates the directionality of binding-site interactions like hydrogen bonds. Furthermore, steric hindrances in the binding pocket can be modeled in a fairly rational way by placing exclusion volume spheres. As soon as a pharmacophore model is generated, it is of utmost importance that it is theoretically validated by prospective VS of a set of known ligands and inactive molecules. In the optimum case, the model should discard inactive molecules and find all of the active molecules of the training set. Model refinement can be achieved by adding or deleting features and changing their tolerance or weight. Nevertheless, when generation of a single restrictive model is not possible, it is preferable to perform the VS experiment with a set of highly specific local models rather than with one global model. This approach usually results in lower false-positive hit rates. A single model would only be able to describe the whole ligand diversity if it is clear that all ligands bind at the same binding site with the same molecular interactions, whereby crystal water and dynamics are negligible (Schuster et al. 2010). The pharmacophore model may always suffer from a bias toward its input information or its validation set.
The only way to assess its predictive power is to validate it in vitro, preferably by target or binding assays. Cellular assays can also be suitable (sometimes they are the only possible way), but possible off-target effects may impede the significance of the experimental results and in turn their validity for model validation.
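The theoretical validation described in this section, screening a set of known actives and inactives and checking which of them the model retrieves, is often summarized with a few simple numbers. The sketch below computes sensitivity, specificity, and an enrichment factor from a hit list. The choice of these three metrics, the function name `validation_metrics`, and the compound identifiers are assumptions for illustration; the data are invented.

```python
def validation_metrics(hits, actives, inactives):
    """Summarize how well a model separates known actives from inactives."""
    hits, actives, inactives = set(hits), set(actives), set(inactives)
    true_pos = len(hits & actives)
    false_pos = len(hits & inactives)
    true_neg = len(inactives) - false_pos
    n_hits = true_pos + false_pos
    total = len(actives) + len(inactives)
    # Enrichment: share of actives among the hits vs. share in the whole set.
    enrichment = (true_pos / n_hits) / (len(actives) / total) if n_hits else 0.0
    return {
        "sensitivity": true_pos / len(actives),    # actives found
        "specificity": true_neg / len(inactives),  # inactives discarded
        "enrichment_factor": enrichment,
    }


# Invented validation set: 10 actives, 90 inactives; the model retrieves
# 8 actives and 4 inactives.
actives = [f"active_{i}" for i in range(10)]
inactives = [f"inactive_{i}" for i in range(90)]
hit_list = actives[:8] + inactives[:4]

metrics = validation_metrics(hit_list, actives, inactives)
```

In the optimum case named above, sensitivity and specificity would both equal 1; adding, deleting, or re-weighting features typically trades one against the other.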
3.4 Ligand-Based Pharmacophores
As mentioned before, proficient pharmacophore models can be created without information on the target structure. More than half of all small molecule drugs act on G-protein-coupled receptors or ion channels (Santos et al. 2016), where protein structures are rare (Hauser et al. 2017), although the number is growing (Pándy-Szekeres et al. 2018). Only around one third of all projects in pharmaceutical research can rely on X-ray structural target information and another third on homology structural target information (Scior et al. 2012). It is therefore an approach with great potential and a wide range of success stories (Vuorinen et al. 2014; Acharya et al. 2011; Ha et al. 2015; Evers et al. 2005; Kratz et al. 2014; Kirchweger et al. 2018). The principle of this attempt is the alignment of different conformations of active molecules in order to visualize common electrochemical features, which might be necessary for target inhibition/activation assuming that suggested groups within the molecules trigger their biological action. The most crucial step of the method is therefore the selection of the training set. These molecules should be potent, small, rigid and most importantly should all have the same binding position. Without an experimental target structure, such information is normally absent. Therefore, it is a good way to cluster molecule sets based on their superficial pharmacophores. LigandScout’s (http://www.inteligand.com) implemented pharmacophore radial distribution function (RDF) code similarity clustering tool is an unsupervised, fast, and efficient tool to do so (Goldmann et al. 2015). Still, also an operator can quickly get an idea of the pharmacophore feature patterns in a dataset and investigate it by conformational sampling and alignment. From different alignments of the molecule set (molecular superpositioning), geometrically overlapping features can then be extrapolated as a pharmacophore model. 
This is achieved by an operator or automatically by different software solutions. The conformer generation (the calculation of possible binding conformations from 2D represented molecules) of the training set is a further crucial step. Consider a hypothetical molecule with three rotatable bonds: if every rotatable bond is sampled in 10° intervals, this gives rise to 46,656 conformations (36 states per bond, 36^3 in total). Some software tools are able to reliably suggest conformations, e.g., as experimentally observed in co-crystals, and are therefore recommended for this purpose (Ebejer et al. 2012; Friedrich et al. 2017). When the pharmacophore model (set) performs well in the theoretical validation, it is used as a query for VS. A recent study by Karaboga and coworkers highlighted the predictive power of pharmacophore-based VS by comparing different VS approaches (Karaboga et al. 2013). The biggest advantages can be summarized as:
1. Their capacity to reduce the immense complexity of ligand-target interactions to an uncomplicated but illustrative model.
2. By just aligning pre-generated conformational libraries to the models, fast VS is achieved.
3. The simplicity of the models allows the identification of ligands with structures deviating from the training molecules. So-called scaffold hopping can be achieved.
Next to the establishment of the models as VS queries, pharmacophores can be used as descriptors for QSAR studies, as similarity metrics for machine learning approaches (Schneider and Schneider 2017) or to pre-filter large compound libraries for molecular docking.
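The conformer count given earlier in this section for a molecule with three rotatable bonds sampled in 10° intervals follows from simple combinatorics: each bond has 360/10 = 36 states, and the states of independent bonds multiply. The short sketch below reproduces this arithmetic; the function name is hypothetical, and the calculation ignores symmetry and steric clashes, which reduce the number of distinct conformers in practice.

```python
def grid_conformer_count(rotatable_bonds, step_degrees):
    """Number of torsion combinations on a regular angular grid."""
    if 360 % step_degrees != 0:
        raise ValueError("step_degrees must divide 360 evenly")
    states_per_bond = 360 // step_degrees
    return states_per_bond ** rotatable_bonds


three_bonds = grid_conformer_count(3, 10)  # 36 ** 3
six_bonds = grid_conformer_count(6, 10)    # 36 ** 6
```

The count grows exponentially with the number of rotatable bonds, which is why exhaustive grid sampling quickly becomes impractical for flexible molecules.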
3.5 Molecular Dynamic Simulations in Structure-Based Virtual Screening
The main problem of the structure-based VS approach is the assumption of a rigid lock-and-key binding theory mainly derived from X-ray co-crystal data, e.g., from the Protein Data Bank (PDB) (Berman et al. 2000, 2003). Although reflecting an experimental finding, it completely ignores flexibility and dynamics of proteins and protein-ligand complexes. A co-crystallized X-ray structure is not more than a snapshot of a ligand-protein system, taken at unphysiological conditions, and it therefore does not depict reality. To some extent, this may be one reason why PDB structure-based pharmacophore approaches are likely to fail in four of ten cases (Wieder et al. 2017). The new paradigm in our understanding of ligand-target interaction is the dynamic induced fit of a ligand to a target. It is aimed to implement this into structure-based VS tools and to introduce such dynamics in structure-based pharmacophore models (Sperandio et al. 2010; Bock et al. 2016; Sohn et al. 2013; Spyrakis et al. 2015) and molecular docking (Makeneni et al. 2018; Campbell et al. 2014; Sabbadin et al. 2014; Liu and Kokubo 2017). Molecular dynamic (MD) simulations are hereby a valuable method to link protein structure and dynamics and help to analyze conformational changes and allosteric modulations. Basically MD is a physics-based method for studying the interaction and motion of atoms according to Newton’s law of motion. This allows the atoms and molecules to interact for a fixed period of time, giving a view on the dynamic evolution of the system. The simulation of this system of interacting particles is defined by numerically computed Newton’s equations of motions. Involved forces and potential energies are calculated by force fields. In theory one could simply simulate the association of a ligand with a protein (Adcock and McCammon 2006). 
Indeed, there are attempts to do so, but computational constraints usually only allow simulations on time scales in the low nanosecond range, or up to milliseconds with specialized hardware (Feig and Sugita 2013). However, a molecular dynamics simulation of a co-crystallized complex can show side-chain rotations and frequently occurring hydrogen bonds or visualize the stability of the complex. How the information obtained from those trajectories can be implemented into models and what else can be learned is a topic of current interest, well outlined in a recent review by De Vivo and coworkers (De Vivo et al. 2016). MD can be used to score free energies of protein-ligand complexes (Gumbart et al. 2013; Rastelli et al. 2009), study the role of bridging water molecules in ligand binding (Sabbadin et al. 2014), visualize ligand unbinding to estimate mean residence times (Mollica et al. 2015), or discover new hidden binding pockets for allosteric activation (Bowman et al. 2015). Further improvements in VS experiments can be achieved, e.g., by ensemble docking. Thereby, ligands are docked into several simulated but rigid protein conformations, and the results from single screens are then merged. This approach is especially helpful for the identification of ligands of proteins whose function is strongly connected to structural flexibility (Pang and Kozikowski 1994; Tarcsay et al. 2013; Tian et al. 2014). Pharmacophores can be created using snapshots of the molecular dynamics trajectory or by extracting feature densities (dynophores) (Mortier et al. 2017; Bock et al. 2016). Another unbiased method is the common hits approach, where the whole dataset is screened against a large number of single pharmacophores derived from molecular dynamics. The virtual hits are then ranked based on their hit count without user intervention (Wieder et al. 2017).
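The numerical integration of Newton's equations of motion that underlies MD can be illustrated on the smallest possible system. The sketch below propagates a single particle in a harmonic potential with the velocity Verlet scheme, a widely used integrator; whether a given MD package uses exactly this scheme is not stated in this chapter. The "force field" here is a single hypothetical harmonic term in reduced units, standing in for the many bonded and non-bonded terms of a real force field.

```python
def harmonic_force_field(x, k=1.0, x_eq=0.0):
    """Toy force field: force and potential energy of one harmonic term."""
    force = -k * (x - x_eq)
    energy = 0.5 * k * (x - x_eq) ** 2
    return force, energy


def velocity_verlet(x, v, mass, dt, n_steps, force_field):
    """Integrate Newton's equation of motion; returns (x, v, total energy) per step."""
    force, _ = force_field(x)
    trajectory = []
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * force / mass   # half-step velocity update
        x = x + dt * v_half                    # full-step position update
        force, e_pot = force_field(x)          # forces at the new position
        v = v_half + 0.5 * dt * force / mass   # complete the velocity update
        trajectory.append((x, v, e_pot + 0.5 * mass * v * v))
    return trajectory


# Reduced units: start displaced by 1.0 and at rest, total energy 0.5.
trajectory = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=0.01, n_steps=1000,
                             force_field=harmonic_force_field)
```

The list of positions and velocities over time is a trajectory in miniature; in a protein-ligand simulation the same loop runs over all atoms, and the cost of evaluating the force field at every step is what limits the accessible time scales.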
3.6 Shape-Based Screening
A VS technique based purely on ligand information is shape-based screening. The 3D Gaussian shapes of a set of molecules are calculated and compared to the shape of a known active query molecule, and the molecules are ranked according to their 3D shape similarity (Fig. 3). This can be achieved by finding and quantifying the maximal overlap of the query molecule's volume with that of the screened molecules (Rush et al. 2005; Hawkins et al. 2007). The similarity of two shapes is usually quantified with Tanimoto or Tversky scores. ROCS, one of the most powerful shape-matching tools (Shin et al. 2015), offers combinations of shape and pharmacophore features for similarity assessment. Molecular shape is a fundamentally important feature of small molecules, which can predict not only activities on certain targets but also absorption, distribution, metabolism, and excretion (Kortagere et al. 2009). Shape comparison alone has proven to be a useful method for certain projects (Grienke et al. 2014). The advantages are the ease of use, the traceability of results, the likelihood of scaffold hopping (although pharmacophores are the gold standard here (Hessler and Baringhaus 2010)), the independence of available target structures, and the rapid screening speed. These advantages make it a well-established method for virtual high-throughput screening. However, shape-based screening has several pitfalls, which should not be neglected: Typically, the bioactive conformation is not known without a co-crystallized target complex. We also know from X-ray structures of different molecules in the active site of the same target that the shape overlap between them is not always as high as assumed by shape-based VS tools. Similar to ligand-based pharmacophores, different ligands may bind to different regions in the same protein or in the same binding site. 
This causes considerable uncertainty, and VS based entirely on shape comparison does not perform well, since physicochemical and pharmacophore properties are neglected. However, the method can be combined with other screening methods, e.g., by using the shape-matching score for rescoring hit lists obtained from molecular docking or pharmacophore-based VS. Moreover, 3D shape descriptors are powerful descriptors for QSAR, QSPR, and machine learning models, not only for predicting biological activity but also for predicting properties such as drug metabolism (Kirchmair et al. 2015).
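To make the shape Tanimoto score concrete, the following sketch computes a first-order Gaussian overlap between two molecules that are assumed to be already superimposed. It is an illustration under stated assumptions and not the ROCS algorithm: every atom is given the same Gaussian (the amplitude `p` and width `alpha` are illustrative values), higher-order overlap corrections are ignored, no alignment optimization is performed, and the coordinates are hypothetical.

```python
import math


def gaussian_overlap(mol_a, mol_b, alpha=0.8, p=2.7):
    """First-order overlap volume of two sets of atom-centred Gaussians.

    Each atom is modelled as p * exp(-alpha * r**2); alpha and p are
    illustrative assumptions, identical for all atoms.
    """
    prefactor = p * p * (math.pi / (2.0 * alpha)) ** 1.5
    total = 0.0
    for xa in mol_a:
        for xb in mol_b:
            d2 = sum((u - v) ** 2 for u, v in zip(xa, xb))
            # Analytical integral of the product of two equal-width Gaussians.
            total += prefactor * math.exp(-alpha * d2 / 2.0)
    return total


def shape_tanimoto(mol_a, mol_b):
    """Shape Tanimoto: O_AB / (O_AA + O_BB - O_AB)."""
    o_ab = gaussian_overlap(mol_a, mol_b)
    o_aa = gaussian_overlap(mol_a, mol_a)
    o_bb = gaussian_overlap(mol_b, mol_b)
    return o_ab / (o_aa + o_bb - o_ab)


# Hypothetical, pre-aligned atom coordinates in angstrom.
query = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
similar = [(0.0, 0.2, 0.0), (1.5, 0.2, 0.0), (3.0, 0.2, 0.0)]
distant = [(10.0, 0.0, 0.0), (11.5, 0.0, 0.0), (13.0, 0.0, 0.0)]

print(shape_tanimoto(query, query))
print(shape_tanimoto(query, similar))
print(shape_tanimoto(query, distant))
```

In a real screen, the score would be maximized over rigid-body superpositions and conformers of each database molecule before ranking.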
3.7 Virtual Parallel Screening
Due to the fast alignment-based procedure of pharmacophore screening, individual (low-energy) conformers of a molecule, but also libraries of molecule conformers, can be matched with a set of pharmacophores that represent different targets or protein isoforms (Steindl et al. 2006a, b). Thus, this approach can be used for:
1. Virtual target fishing: Conformers of isolated NP are screened against a set of validated pharmacophore models representing drug targets in order to determine those promising for testing. It is applied to identify bioactivities for recently isolated and novel NP (with resolved structure). Targets for observed phenotypic effects of extracts with multiple known constituents can be predicted by screening against a set of target models potentially influential to the assay outcomes. This may guide the targeted isolation and testing of specific compounds on predicted drug targets (Rollinger 2009; Rollinger et al. 2009).
2. Putting traditionally used herbal remedies on a molecular basis: Herbal remedies usually contain hundreds of constituents. One challenge is to identify those compounds that mainly contribute to the beneficial effect of the remedy; the other is to identify the involved targets and underlying mechanisms of action. Virtual parallel screening may be tremendously helpful to explore yet undiscovered biological actions in this framework (Grienke et al. 2015).
3. Prediction of side effects: Potential side effects like cardiotoxicity, unfavorable cytochrome metabolism, nausea, psychotic symptoms, or anticholinergic effects of herbal remedies, dietary supplements, or single molecules can be predicted and assessed in a rational manner (Klabunde and Evers 2005; Adhami et al. 2012; Kratz et al. 2014, 2017; Hochleitner et al. 2017).
4. Polypharmacological profiling: A specific modulation profile is crucial for the safety and therapeutic usefulness of drug groups like kinase inhibitors or dopamine receptor modulators. Therefore, a broad set of selective inhibition/activation models for the related targets is applied for virtual parallel screening. In this way, the obtained predictions contribute to a fast and rational selection of molecules for testing (Malo et al. 2010).
Independent of the approach, a few prerequisites should however be considered:
1. Usage of published, experimentally validated models is recommended, since most pharmacophore models are only validated theoretically on molecule sets lacking diversity. Their performance on novel molecules is therefore doubtful.
2. Depending on the intentions of virtual parallel screening, the specificity of the models should be appropriate. For an off-target, like the hERG potassium channel, models with high sensitivity are favorable, as a higher number of false positives is acceptable in order not to miss real actives. Conversely, for target fishing, models with high specificity are preferable.
3. It is important to have some information on the models. What was their enrichment factor in the theoretical and in the experimental validation? Are the applied models local or global? Which software was used to generate the models? Which conformer generator was used for the query molecules, and was it the same one used for library generation? What was the query crystal structure, or was the model ligand based? All this information may be important for the interpretation of the results.
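The quality measures mentioned in these prerequisites (sensitivity, specificity, enrichment factor) follow directly from the outcome of a validation screen. The sketch below shows the arithmetic; the function name `validation_metrics` and the validation set (50 actives, 950 decoys, of which 40 actives and 60 decoys are retrieved) are hypothetical.

```python
def validation_metrics(tp, fp, tn, fn):
    """Basic quality metrics of a model from a validation screen.

    tp/fn: actives that were / were not retrieved by the model
    fp/tn: inactives (or decoys) that were / were not retrieved
    """
    n_total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)          # fraction of actives found
    specificity = tn / (tn + fp)          # fraction of inactives rejected
    hit_rate_in_hit_list = tp / (tp + fp)
    hit_rate_in_database = (tp + fn) / n_total
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Enrichment of actives in the hit list relative to random picking.
        "enrichment_factor": hit_rate_in_hit_list / hit_rate_in_database,
    }


# Hypothetical validation: 50 actives and 950 decoys in the set,
# the model retrieves 40 actives and 60 decoys.
metrics = validation_metrics(tp=40, fp=60, tn=890, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For an antitarget model, one would tolerate a lower specificity in exchange for a sensitivity close to 1; for target fishing, the priorities are reversed.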
3D pharmacophore modeling is one of the most important virtual parallel screening approaches, as it combines comprehensive and transparent results with fast screening. Virtual parallel screening can also be performed with molecular docking (Chen et al. 2003; Wang et al. 2012). This reverse docking approach is illustrated in Fig. 4. However, for large-scale screening including side-chain flexibility, a high-performance computing cluster may be necessary. The fact that target identification for novel NP is a common obstacle, difficult to overcome with other methods, highlights the importance of this approach. Moreover, we are currently experiencing a paradigm shift from a one-drug-one-target to a polypharmacological one-drug-many-targets model for drug discovery. This is particularly the case for chronic diseases, such as chronic inflammation (Koeberle and Werz 2014). It can be presumed that NP bear great potential for the development of therapeutic agents in these disease areas.
3.8 Machine Learning
Machine learning is a broad term for computer programs that, in the context of drug discovery, mine useful knowledge from molecular structures. It is one of the most dynamic topics in computer-aided drug discovery (Varnek and Baskin 2012). One cause is the onset of big data in cheminformatics. In contrast to the previously described methods, machine learning can easily be scaled up to big datasets. Machine learning methods aim to identify patterns in empirical datasets and to generate mathematical relationships that can be extrapolated to predict properties of novel compounds. One important application is QSAR and QSPR. Here, artificial intelligence is used to predict how chemical modifications might influence activity or biological properties like toxicity, carcinogenesis, or metabolism. QSAR is based on the pioneering works of Hansch and of Free and Wilson, who used multivariate regression models to create mathematical formulas that correlate activity with molecular properties like lipophilicity (Hansch et al. 1962; Free and Wilson 1964). There exists a broad range of methods: multiple regression analysis, support vector machines, principal component analysis, hierarchical cluster analysis, decision trees, random forest, k-nearest neighbor, and artificial neural networks, to name just a few. They can be classified into supervised and unsupervised learning. The workflow can be broken down into the extraction of molecular features (e.g., topological descriptors, physicochemical properties, pharmacophores) from a training set, creation of molecular fingerprints from those features, similarity comparison for shared/different features (supervised or unsupervised), generation of models based on observations in the training set, and finally validation of the obtained models with test sets (Lavecchia 2015). Resulting models can be powerful tools: Reker and coworkers (Reker et al. 2014) as well as Schneider and Schneider (2017) used the neural network technique of self-organizing maps to make target predictions for new molecular entities with good results. With the advent of big data and deep learning, there is also reemerging interest in neural network algorithms for solving chemoinformatic tasks. Although shallow neural network approaches have been used in drug research for a long time, newer advances with stunning performance have prompted many to consider deep learning a game-changing technology for VS (Gawehn et al. 2016; Pereira et al. 2016; Schneider 2017; Balaban 1997).
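The workflow described above (fingerprints from a training set, similarity comparison, model generation, validation on a test set) can be illustrated with a k-nearest neighbor classifier, one of the supervised methods listed. The sketch is a toy example under stated assumptions: fingerprints are represented as sets of on-bit indices, and all fingerprints and activity labels are hypothetical rather than taken from any real dataset.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def knn_predict(query_fp, training_set, k=3):
    """Majority vote over the k training molecules most similar to the query."""
    neighbours = sorted(
        training_set, key=lambda item: tanimoto(query_fp, item[0]), reverse=True
    )[:k]
    active_votes = sum(1 for _, label in neighbours if label == "active")
    return "active" if active_votes > len(neighbours) / 2 else "inactive"


# Hypothetical training set: (fingerprint, experimentally assigned label).
training = [
    ({1, 2, 3, 4}, "active"),
    ({1, 2, 3, 5}, "active"),
    ({2, 3, 4, 6}, "active"),
    ({7, 8, 9}, "inactive"),
    ({7, 8, 10}, "inactive"),
    ({8, 9, 11}, "inactive"),
]

# Hypothetical external test set used to validate the model.
test_set = [({1, 2, 3, 6}, "active"), ({7, 8, 9, 10}, "inactive")]

correct = sum(knn_predict(fp, training) == label for fp, label in test_set)
accuracy = correct / len(test_set)
print(f"test set accuracy: {accuracy:.2f}")
```

The same skeleton applies to other supervised methods; only the model-building step changes, while feature extraction and external validation stay in place.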
4 Limitations and Caveats in VS
Although there are abundant reports on successful VS applications, including the discovery of novel ligands from nature, one should be aware that virtual hits are just predictions and that no method, regardless of its complexity and rationality, is bulletproof. Moreover, certain caveats and limitations are pervasive. Researchers who do not originally come from a computational chemistry-related field are prone to mistakes that can often be easily avoided once one is aware of them. A review by Scior and coworkers addresses these "pitfalls" and is highly recommended for further reading on this topic (Scior et al. 2012). This chapter gives a brief overview of the ten most important caveats:
1. An in silico experiment strongly depends on its input information. The quality of the input data should therefore be checked and the data carefully prepared. Macromolecular and ligand structures from PDB entries (Berman et al. 2000, 2003) are not error-free. They can be partially incomplete with missing atoms and residues (Brandt et al. 2008), binding pockets and ligands may not fit their experimentally determined electron density, asparagine and glutamine rotamers can be incorrectly assigned, and irregular protonation states of ionizable residues and ligands can occur. Open-access databases like PubChem (Kim et al. 2016) and ChEMBL (Gaulton et al. 2012; Bento et al. 2014) offer a large quantity of information, but it should be checked against the original literature for correct conformations, annotated bioactivities, and other errors.
2. Generally, there is only limited information available on inactive molecules. Although many inhibitors/activators have been reported, only a few inactives are published for certain targets. In such cases, it may be possible to generate decoys for model validation. In contrast to inactives, which are experimentally confirmed to be inactive, decoys are drug-like molecules that share similarity with the active molecules but have never been tested. The DUD-E web service (Mysinger et al. 2012) assigns decoys with similar physicochemical but different topological properties to active molecules. The topological differences should minimize the possibility of retrieving real active molecules as decoys. Similar physicochemical properties guarantee challenging decoys for the validation of molecular docking settings and pharmacophore models.
3. As described in Chap. 5, there are several molecules that frequently give false-positive results in bioactivity measurements. If such a molecule is employed as a training compound or as part of the validation set, it may destroy the predictive power of the query. Applying substructure filters to the training and validation sets as well as checking original publications for inaccuracies can help prevent this pitfall.
4. The data obtained from the literature can be difficult to compare, as they are generated with different assays and cell lines, by different working groups, or with deviating protocol settings (incubation time, readout, etc.). Highly potent activators for model validation or model generation are not always easy to identify in such noise. In such cases, the best approach is to discard all molecules with unreliable data and to focus only on compounds that are reported frequently because they are acknowledged positive controls, clinical candidates, or drugs. In some cases, subjective picking can be the only way to classify the molecules.
5. Assessing and comparing the performance of several screening methods or models is crucial to identify the most suitable one, but the comparison of VS protocols and models is difficult. General quality metrics used for VS include specificity, sensitivity, accuracy, enrichment of actives, ROC curves, area under the curve, and BEDROC curves. Most of these metrics depend on the molecule sets used for validation, especially on the ratio of active to inactive molecules (Kirchmair et al. 2009; Sheridan 2008). Truchon and Bayly gave practical recommendations on their use (Truchon and Bayly 2007).
6. Most 3D VS methods depend on conformer generators with the ability to identify bioactive molecule conformations. The performance of several conformer generators was recently assessed, and the conformer generators Omega, ConfGenX, and ICon were recommended. The best performing open-source tool according to the study of Friedrich et al. (2017) is RDKit.
7. There is an abundance of different structure formats, and it is often necessary to read in, write out, or interconvert them. Because of format incompatibilities, information can get distorted. The correctness of atomic coordinates and the handling of aromatic moieties, chirality, hybridization, and protonation can cause problems and make manual correction necessary (Kirchmair et al. 2008).
8. Ligand-based as well as structure-based modeling is always partly based on assumptions. A target may have multiple or allosteric binding pockets. Selected hits may be active in the experimental confirmation, but they may bind to an alternative binding pocket. The question of whether the hits identified by VS really bind in the pose predicted by docking, or with the interaction pattern encoded in a pharmacophore model, is rarely addressed. A hit may simply have been discovered by serendipity, which would mean that the method is less proficient than supposed.
9. Ligand-based VS, whether shape-based or shape-focused screening and, to a lesser extent, pharmacophore-based screening, assumes a large overlap of ligands in the binding pocket. However, the overlap observed in X-ray crystal structures is not always as large as one would assume. Proper theoretical model validation and a set of local models instead of a global one can avoid this pitfall (Schuster et al. 2010; Scior et al. 2012).
10. New, highly potent lead-like compounds are the ultimate aim. Hits identified by VS are, however, generally less active than the query molecules. Accordingly, operators (and reviewers) should not expect the virtual hits to be more active than the query structures. The main goal should rather be to generate several novel and structurally diverse hits as starting points for further research. Zhu and coworkers critically reviewed the literature and gave recommendations, among others, on hit identification criteria (Zhu et al. 2013).
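As an illustration of caveat 5, the area under the ROC curve can be computed directly from a ranked validation screen. The sketch below uses the equivalent pairwise formulation (the probability that a randomly chosen active is scored higher than a randomly chosen inactive); the function name `roc_auc`, the scores, and the labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from screening scores (higher = better).

    labels: 1 for actives, 0 for inactives or decoys.
    Computed as the fraction of active/inactive pairs in which the
    active is ranked higher; ties count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Hypothetical fit scores of six validation molecules.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
print(f"ROC AUC: {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. Early-recognition metrics such as BEDROC additionally weight the top of the ranked list, which is what matters most when only a few hits can be tested.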
However, if these obstacles and recommendations are taken into account, VS can be a powerful tool. As an example, Doman and coworkers reported a random screening hit rate of 0.02% for protein tyrosine phosphatase inhibitors, while the screening of virtually predicted hits yielded a hit rate of 34.8% (Doman et al. 2002).
5 Special Considerations and Application Examples of VS of Natural Products
The straightforward applicability of many computational tools has prompted NP scientists to implement them in their research. Conversely, the privileged structures of NP and their metabolite-likeness have prompted computational chemists to screen and work with NP libraries. Obstacles but also opportunities arise at the interface of these two scientific disciplines and should be considered.
5.1 Quality and Availability of Resources
The availability of virtual databases comprising structurally and stereochemically well-defined compounds is necessary. Next to commercial NP databases like the Dictionary of Natural Products (http://dnp.chemnetbase.com), several, in part smaller, open-source libraries exist. However, most virtual hits from VS studies are not physically available or affordable. Only approximately 10% of all molecules in NP libraries can be obtained commercially. The resources for the computer-guided discovery of bioactive NP have been reviewed recently (Chen et al. 2017). Virtual databases of in-house compound libraries can moreover be of great value. Compounds that are physically available for testing can easily be drawn and compiled into small databases in usable formats.
The quality of some virtual libraries may be uncertain. Wrong structure determination or erroneous assignment of conformations can occur and should be considered. Proper work-up of the virtual hit list and comparison of hits with the literature can clear up reservations. When creating one's own virtual library, special attention should be given to the stereochemistry. Grienke et al. created their own virtual library of 279 constituents of Ganoderma lucidum to predict the molecular mechanism of the antiviral and metabolic activity of this TCM drug (Grienke et al. 2015).
Computational methodologies strongly rely on the experience from studies on synthetic molecules, which were used to train algorithms. NP generally have a deviating architecture, e.g., more unsaturated bonds, higher flexibility, and higher oxygen content, but fewer halogens and more fused rings. Proper theoretical validation of workflows and models prior to their application should be seen as mandatory (Rollinger and Wolber 2011). However, as discussed before, it is exactly this fascinating molecular architecture that makes NP bioactive agents, and researchers can feel encouraged to work with them.
5.2 Postprocessing of a NP Virtual Screening Hit List
As a result of a prospective VS, a hit list comprising dozens to thousands of small molecules (which are, according to the VS filter, likely to exert a biological response) is retrieved. The prioritization and selection of hits for experimental evaluation is a vital part of the VS process, especially for hit generation from NP, where pure compounds are precious, since they are rarely available (or affordable) from commercial suppliers (Fig. 5). The pure substances may have to be isolated from natural starting material with enormous effort, and procurement is not the only challenge posed by a natural product hit list. However, classical pharmacognostic know-how (i.e., on the traditional use of herbal preparations from which the compounds originate) can assist the selection process and may increase the success rate of finding bioactive molecules. Several considerations and tools exist that can assist this process. Screening hit selection of NP has to be performed with utmost care and can be a complex process. Medicinal chemistry considerations (scaffold diversity, prediction of assay interference) as well as pharmacognostic considerations should influence the decision. To achieve chemical diversity, hit lists can be clustered into groups (e.g., by hierarchical cluster analysis). Another approach is to prioritize the virtual hits based on QSAR models or similarity (shape comparison) to query molecules. Substructure, physicochemical, and machine learning filters can be applied to scale down the hit list and to clear out non-drug-like molecules and pan-assay interference compounds (PAINS). The best-known example of a physicochemical property filter is Lipinski's rule of five (Lipinski et al. 2001). Next to Lipinski's rule of five, the Veber rules (Veber et al. 2002), the Ghose filter (Ghose et al. 1999), as well as more specific rules like the blood-brain barrier rule (Pardridge 2005) should be taken into account depending on the target to be reached. 
Especially for testing intracellular targets, cell permeability should be ensured. Although downsizing a hit list in this way is a very rational approach, it should be considered that many agent classes, such as antibiotics or molecules targeting protein-protein interactions, routinely fall outside the constraints of these rules. Moreover, NP-derived small molecules are biosynthesized in aqueous solution and have long been perceived as drug-like. Therefore, it can be worth looking outside the scope of such general drug-likeness rules. Apart from drug-likeness issues, NP often contain reactive groups and have high lipophilicity, autofluorescence, or other undesirable properties that make them prone to in vitro assay interference and, in turn, false-positive results. Broadly distributed PAINS motifs in NP are catechols, hydroquinones, epoxide and peroxide bridges, phenolic Mannich bases, and others (Baell 2016). Although several filters for sorting out PAINS are widespread and able to process thousands of compounds within seconds, such black box treatment is under criticism. Control experiments to check for, e.g., aggregation or fluorescence interference are then mandatory for proof of concept. Substructure filters, for example those contained in the FAF-Drugs4 online tool (Lagorce et al. 2017), are useful to flag suspicious compounds in a hit list. Nevertheless, if virtual hits are promising molecules, e.g., in terms of ranking, scaffold diversity, ethnobotanical use, or simply availability, a PAINS alert should not be a no-go as long as one is prepared to perform control experiments. Mitoxantrone, rifampicin, cephalosporin, and artemisinin would never have passed a PAINS filter but turned out to be valuable drugs.
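A physicochemical property filter of the kind discussed above can be sketched as follows. The thresholds are the commonly cited ones for Lipinski's rule of five (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors and 10 acceptors) and the Veber rules (at most 10 rotatable bonds, polar surface area at most 140 Å²). The descriptor values are assumed to be precomputed with a cheminformatics toolkit; the hit names, descriptor values, and the function `drug_likeness_flags` are hypothetical. In line with the text, compounds are flagged rather than discarded.

```python
def lipinski_violations(d):
    """Number of violated rule-of-five criteria for one descriptor record."""
    return sum([
        d["mw"] > 500,    # molecular weight in Da
        d["logp"] > 5,    # calculated octanol/water partition coefficient
        d["hbd"] > 5,     # hydrogen bond donors
        d["hba"] > 10,    # hydrogen bond acceptors
    ])


def passes_veber(d):
    """Veber rules: rotatable bonds <= 10 and polar surface area <= 140 A^2."""
    return d["rotb"] <= 10 and d["tpsa"] <= 140


def drug_likeness_flags(d, max_violations=1):
    """Flag a compound instead of removing it from the hit list."""
    violations = lipinski_violations(d)
    return {
        "lipinski_violations": violations,
        "lipinski_ok": violations <= max_violations,
        "veber_ok": passes_veber(d),
    }


# Hypothetical virtual hits with precomputed descriptors.
hits = {
    "hit_1": {"mw": 342.4, "logp": 2.1, "hbd": 3, "hba": 6, "rotb": 4, "tpsa": 96.0},
    "hit_2": {"mw": 812.9, "logp": 6.3, "hbd": 7, "hba": 14, "rotb": 15, "tpsa": 210.0},
}

report = {name: drug_likeness_flags(d) for name, d in hits.items()}
for name, flags in report.items():
    print(name, flags)
```

Keeping the flags next to the ranking, instead of deleting flagged entries, leaves room for the case-by-case decisions argued for above.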
Kirchweger et al. used pharmacophore-based VS experiments to identify novel ligands for GPBAR1 (Kirchweger et al. 2018). The obtained hit list with more than 1000 compounds was re-ranked with the TC score obtained from shape-focused screening. Diversity was achieved by physicochemical clustering using principal components obtained by chemGPS. In the final selection, substructure and PAINS filters were applied flexibly, which led to the identification of sesquiterpene coumarins and triterpenes as high-efficacy ligands for this bile acid receptor.
Su and coworkers performed a VS approach for novel inhibitors of the Rho kinase. Prior to the VS, they cleaned and focused their library with fingerprint clustering and drug-likeness filters. The VS hit list obtained by molecular docking was scaled down by re-docking and a QSAR-based scoring function specially developed for kinase inhibitors. Only 6 of the top 100 ranked virtual hits were subjected to experimental validation, which led to the identification of phloretin and baicalein showing IC50 values in the nanomolar range (Su et al. 2015).
5.3 Selection of Compounds and Natural Starting Material
The screening of herbal extracts is seen as dirty and expensive. Isolation, fractionation, dereplication, and characterization are labor- and time-intensive. The complexity of multicomponent mixtures, aggregation, assay interference, and instability of possible constituents make it questionable whether the active principles can be identified. Computational methods can bypass many of these steps by testing specific compounds in silico and can guide the selection of compounds or of proper natural starting material for experimental investigation. After computational prediction and postprocessing of the hit list, several questions have to be answered. For some NP and herbal preparations, reports from medicinal usage or from phenotypic or in vivo experiments without known molecular modes of action are available. Several successful projects have previously shown that extracting that knowledge and using it for decision-making enriches the outcomes (Rollinger et al. 2004, 2006b, 2008; Kratz et al. 2016; Waltenberger et al. 2016). The virtual hit itself may be obtained in sufficient purity from commercial suppliers, or the secondary metabolite has to be re-isolated from a reported natural source. In this case, literature on its isolation should be available. The natural starting material should be accessible and legally available for collection/acquisition considering issues of bioprospecting, intellectual property rights, and transfer of natural material from abroad (Nagoya Protocol; Matthias and Clare 2011). Considerations on detection, dereplication, and targeted isolation methods of the virtual hit and analogs from an extract can influence decision-making. Finally, it must be possible to isolate virtual hits and congeners in an adequate time, amount, and purity.
In their attempt to assess the cardiotoxic risk by hERG channel blockage of commonly consumed NP, Kratz et al. screened a 3D multiconformational NP database comprising 130,000 molecules against a validated pharmacophore model set (Kratz et al. 2014). The majority of virtual hits were identified as constituents of 12 frequently used medicinal plant genera. Small-scale lead-like enhanced extracts from these plant materials were prepared and tested in a patch clamp assay. Of these, four plant extracts showed potent hERG inhibition, among them Ipecacuanhae radix. Preparations of this antiemetic traditional medicine are easily available OTC products, which underlines the necessity for systematic risk assessment by antitarget identification (Kratz et al. 2016).
For a comparative analysis of the performance of different bioactivity detection tools, Rollinger et al. (2005) used both pharmacophore-based VS and a classical bioassay-guided strategy for the identification of new cyclooxygenase inhibitors. Since different Diels-Alder adducts from Sang Bai Pi, the root bark of Morus alba, were predicted by pharmacophore-based VS, the methanolic extract of this traditionally used herbal drug was examined. Bioassay-guided fractionation then led to the isolation and identification of nine COX inhibitors, the majority of which had been predicted in the VS approach (Rollinger et al. 2004).
For the discovery of natural FXR activators, a set of validated pharmacophore models (Schuster et al. 2011) was used as queries for the VS of a Chinese Herbal Medicine (CHM) database comprising 10,000 compounds from traditional Chinese medicine. Because several triterpenes known from the fruit body of Ganoderma lucidum were found in the virtual hit list, extracts of this medicinal mushroom were prepared and proved to be active in the experimental setup. Phytochemical work-up then led to the identification of five lanostane triterpenes, which induced FXR activation in the low micromolar range (Grienke et al. 2011).
5.4 Find Molecular Targets for Novel Compounds
The inherent inability of the approach to find novel compounds constitutes a clear limitation of VS studies. This can, however, be circumvented when a phytochemical investigation is performed prior to VS in order to isolate possibly new compounds from a given natural starting material. VS is then applied to all isolated and structurally identified molecular entities, including the new ones, for the elucidation of potential targets (Rollinger 2009). NP are often multi-target compounds, and the assessment of the whole target ensemble is an almost impossible task. Although only hypothetical and limited to known and structurally defined target proteins, VS may give a clue for finding the involved molecular mechanism by in silico target prediction. This strategy was previously exemplified for 16 constituents isolated from the aerial parts of the medicinal plant Ruta graveolens (Rollinger et al. 2009). The small compound collection was virtually screened against a set of 2208 pharmacophore models, which helped to identify novel inhibitors of acetylcholinesterase, the human rhinovirus coat protein, and the cannabinoid receptor type 2.
Schneider and Schneider (2017), for example, used their Target Inference Generator (TIGER) software to find protein targets for the new molecular entity marinopyrrole A. TIGER uses topological pharmacophore similarities between the subject and a set of reference compounds with 331 known targets. Four of six predicted and tested targets were experimentally validated, among them the glucocorticoid receptor with a KB of 0.7 μM.
Gong et al. (2014) used a reverse docking approach against 211 cancer-related targets to unravel the phenotypic cytotoxic effect of two novel sponge isolates. Only the ten best-ranked targets were tested, which led to the discovery of novel h(p300) inhibitors.
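The final selection step of such a reverse screening campaign, picking the best-ranked targets for one compound, can be sketched as follows. This is a minimal sketch and not the protocol of the cited studies: the target names and docking scores are hypothetical, and it is assumed that more negative scores indicate better predicted binding. Raw docking scores are not strictly comparable across different binding sites, so in practice some form of per-target normalization is often applied before ranking.

```python
def top_targets(scores, n=10, lower_is_better=True):
    """Return the n best-ranked targets for one compound.

    scores: mapping of target name -> score of the compound on that target.
    """
    ranked = sorted(
        scores.items(), key=lambda item: item[1], reverse=not lower_is_better
    )
    return [target for target, _ in ranked[:n]]


# Hypothetical docking scores (kcal/mol) of one isolate against four targets.
docking_scores = {
    "target_A": -9.1,
    "target_B": -6.3,
    "target_C": -10.4,
    "target_D": -7.8,
}

selection = top_targets(docking_scores, n=2)
print(selection)
```

The same function can rank pharmacophore fit values by setting `lower_is_better=False`, since a higher fit value usually indicates a better match.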
5.5 Elucidation of Natural Product Molecular Binding Mechanism
Docking and MD simulations, among other methods, can predict the binding mode of NP to their respective targets and thus offer valuable support for understanding bioactivities on a molecular level.
For example, Fu and coworkers identified the binding mode of xanthohumol to the anti-inflammatory target myeloid differentiation protein 2 (MD-2) with docking experiments. They corroborated their in silico prediction by MD simulation of the proposed complex. The authors verified their hypothesis in vitro by surface plasmon resonance experiments and an enzyme-linked immunosorbent assay with MD-2 mutants, in which the predicted key residues for hydrogen bond formation were mutated to alanine (Fu et al. 2016).
Atanasov et al. (2013) found polyacetylenes from Notopterygium incisum as PPARγ ligands using an ethnobotanical screening. To elucidate their binding mode, docking experiments against a magnolol-bound PPARγ structure were performed. Although not experimentally validated, the results suggested a similar binding mode and highlighted key residues in the ligand-target recognition.
DNA intercalation is a common mode of action for NP, and Mulholland and Wu (2016) performed extensive research on the dynamics of this process. They investigated in silico the binding mechanism of telomestatin, a Streptomyces isolate that induces apoptosis in cancer cells, to telomeric G-quadruplex DNA. Although previous docking studies revealed a plausible binding mode, they gave no detailed information on the process of binding. One millisecond molecular binding simulations showed the formation of three stable binding poses. In addition to these binding poses, the authors also showed the dynamics of DNA intercalation and observed the interconversion of one pose into another.
6 Conclusions
The studies presented here can only give a limited insight into the research applications that have been published in recent years. Although NP scientists all over the world work through thousands and thousands of extracts and their isolates, many NP remain to be discovered. In particular, the search for new sources, e.g., in the field of microbes and marine organisms, is a renewed area of interest in NP research and may provide new chemical scaffolds. These explorations are mandatory for enriching our pool of NP diversity. Additionally, historical information from traditional medicine and findings from observational studies on the one hand, and the increasing knowledge on structural and biological data of new chemical entities, macromolecular targets, and their physiological role in humans on the other hand, provide a vast source of data. Combining information derived from all these heterogeneous sources, structuring big data, and not getting lost within it will be a future challenge. VS experiments can derive a maximum benefit from this increase of life science data and thereby strengthen their role in NP drug discovery. However, awareness concerning data reliability and a critical and unbiased attitude toward predicted results are indispensable prerequisites for successful projects. If its limits and pitfalls are considered and its potential is exploited, VS will successfully guide future studies and thereby augment our knowledge on bioactive natural lead structures.
References
Acharya C, Coop A, Polli JE et al (2011) Recent advances in ligand-based drug design: relevance and utility of the conformationally sampled pharmacophore approach. Curr Comput Aided Drug Des 7:10–22
Adcock SA, McCammon JA (2006) Molecular dynamics: survey of methods for simulating the activity of proteins. Chem Rev 106:1589–1615
Adhami HR, Linder T, Kaehlig H et al (2012) Catechol alkenyls from Semecarpus anacardium: acetylcholinesterase inhibition and binding mode predictions. J Ethnopharmacol 139:142–148
Akella LB, DeCaprio D (2010) Cheminformatics approaches to analyze diversity in compound screening libraries. Curr Opin Chem Biol 14:325–330
Anderson AC (2003) The process of structure-based drug design. Chem Biol 10:787–797
Atanasov AG, Blunder M, Fakhrudin N et al (2013) Polyacetylenes from Notopterygium incisum – new selective partial agonists of peroxisome proliferator-activated receptor-gamma. PLoS One 8:e61755
Baell JB (2016) Feeling nature’s PAINS: natural products, natural product drugs, and pan assay interference compounds (PAINS). J Nat Prod 79:616–628
Bajorath J (2001) Selected concepts and investigations in compound classification, molecular descriptor analysis, and virtual screening. J Chem Inf Comput Sci 41:233–245
Bajorath J (2017) Molecular similarity concepts for informatics applications. Methods Mol Biol 1526:231–245
Bajusz D, Rácz A, Héberger K (2015) Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J Cheminform 7:20
Balaban AT (1997) Neural networks in QSAR and drug design. In: J Devillers (ed) vol. 2 in the series: principles of QSAR and drug design. J Chem Inf Comput Sci 37:628–629
Ban F, Dalal K, Li H et al (2017) Best practices of computer-aided drug discovery: lessons learned from the development of a preclinical candidate for prostate cancer with a new mechanism of action. J Chem Inf Model 57:1018–1028
Bento AP, Gaulton A, Hersey A et al (2014) The ChEMBL bioactivity database: an update. Nucleic Acids Res 42:D1083–D1090
Berman HM, Westbrook J, Feng Z et al (2000) The protein data bank. Nucleic Acids Res 28:235–242
Berman H, Henrick K, Nakamura H (2003) Announcing the worldwide protein data bank. Nat Struct Mol Biol 10:980
Bock A, Bermudez M, Krebs F et al (2016) Ligand binding ensembles determine graded agonist efficacies at a G protein-coupled receptor. J Biol Chem 291:16375–16389
Bowman GR, Bolin ER, Hart KM et al (2015) Discovery of multiple hidden allosteric sites by combining Markov state models and experiments. Proc Natl Acad Sci U S A 112:2734–2739
Brandt BW, Heringa J, Leunissen JA (2008) SEQATOMS: a web tool for identifying missing regions in PDB in sequence context. Nucleic Acids Res 36:W255–W259
Brown FK (1998) Chapter 35 – Chemoinformatics: what is it and how does it impact drug discovery. Annu Rep Med Chem 33:375–384
Campbell AJ, Lamb ML, Joseph-McCarthy D (2014) Ensemble-based docking using biased molecular dynamics. J Chem Inf Model 54:2127–2138
Cavasotto CN, Orry AJ (2007) Ligand docking and structure-based virtual screening in drug discovery. Curr Top Med Chem 7:1006–1014
Chavan S, Nicholls IA, Karlsson BC et al (2014) Towards global QSAR model building for acute toxicity: Munro database case study. Int J Mol Sci 15:18162–18174
Chen YC (2015) Beware of docking! Trends Pharmacol Sci 36:78–95
Chen X, Ung CY, Chen Y (2003) Can an in silico drug-target search method be used to probe potential mechanisms of medicinal plant ingredients? Nat Prod Rep 20:432–444
Chen Y, De Bruyn Kops C, Kirchmair J (2017) Data resources for the computer-guided discovery of bioactive natural products. J Chem Inf Model 57:2099–2111
Claude Cohen N (2007) Medicine pipeline: structure-based drug design and the discovery of aliskiren (Tekturna®): perseverance and creativity to overcome a R&D pipeline challenge. Chem Biol Drug Des 70:557–565
Cordier C, Morton D, Murrison S et al (2008) Natural products as an inspiration in the diversity-oriented synthesis of bioactive compound libraries. Nat Prod Rep 25:719–737
Corwin HA, Leo DH, Hoekman D (1995) Exploring QSAR: fundamentals and applications in chemistry and biology. Hydrophobic, electronic, and steric constants. American Chemical Society, Washington, DC
Dalby A, Nourse JG, Hounshell WD et al (1992) Description of several chemical structure file formats used by computer programs developed at molecular design limited. J Chem Inf Comput Sci 32:244–255
Danishuddin, Khan AU (2016) Descriptors and their selection methods in QSAR analysis: paradigm for drug design. Drug Discov Today 21:1291–1302
De Vivo M, Masetti M, Bottegoni G et al (2016) Role of molecular dynamics and related methods in drug discovery. J Med Chem 59:4035–4061
Doman TN, McGovern SL, Witherbee BJ et al (2002) Molecular docking and high-throughput screening for novel inhibitors of protein tyrosine phosphatase-1B. J Med Chem 45:2213–2221
Ebejer JP, Morris GM, Deane CM (2012) Freely available conformer generation methods: how good are they? J Chem Inf Model 52:1146–1158
El-Houri RB, Mortier J, Murgueitio MS et al (2015) Identification of PPARγ agonists from natural sources using different in silico approaches. Planta Med 81:488–494
Evers A, Hessler G, Matter H et al (2005) Virtual screening of biogenic amine-binding G-protein coupled receptors: comparative evaluation of protein- and ligand-based virtual screening protocols. J Med Chem 48:5448–5465
Feig M, Sugita Y (2013) Reaching new levels of realism in modeling biological macromolecules in cellular environments. J Mol Graph Model 45:144–156
Fernandez-Leiro R, Scheres SHW (2016) Unravelling the structures of biological macromolecules by cryo-EM. Nature 537:339–346
Free SM, Wilson JW (1964) A mathematical contribution to structure-activity studies. J Med Chem 7:395–399
Friedrich NO, De Bruyn Kops C, Flachsenberg F et al (2017) Benchmarking commercial conformer ensemble generators. J Chem Inf Model 57:2719–2728
Friesner RA, Murphy RB, Repasky MP et al (2006) Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein-ligand complexes. J Med Chem 49:6177–6196
Fu W, Chen L, Wang Z et al (2016) Determination of the binding mode for anti-inflammatory natural product xanthohumol with myeloid differentiation protein 2. Drug Des Devel Ther 10:455–463
Gasteiger J, Engel T (2003) Chemoinformatics: a textbook. Wiley-VCH, Weinheim
Gasteiger J (2016) Chemoinformatics: achievements and challenges, a personal view. Molecules 21:151
Gaulton A, Bellis LJ, Bento AP et al (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40:D1100–D1107
Gawehn E, Hiss JA, Schneider G (2016) Deep learning in drug discovery. Mol Inform 35:3–14
Ghose AK, Viswanadhan VN, Wendoloski JJ (1999) A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. A qualitative and quantitative characterization of known drug databases. J Comb Chem 1:55–68
Goldmann D, Pakfeifer P, Hering S et al (2015) Novel scaffolds for modulation of TRPV1 identified with pharmacophore modeling and virtual screening. Future Med Chem 7:243–256
Gong J, Sun P, Jiang N et al (2014) New steroids with a rearranged skeleton as (h)P300 inhibitors from the sponge Theonella swinhoei. Org Lett 16:2224–2227
Grienke U, Mihaly-Bison J, Schuster D et al (2011) Pharmacophore-based discovery of FXR-agonists. Part II: identification of bioactive triterpenes from Ganoderma lucidum. Bioorg Med Chem 19:6779–6791
Grienke U, Braun H, Seidel N et al (2014) Computer-guided approach to access the anti-influenza activity of licorice constituents. J Nat Prod 77:563–570
Grienke U, Kaserer T, Pfluger F et al (2015) Accessing biological actions of Ganoderma secondary metabolites by in silico profiling. Phytochemistry 114:114–124
Grosdidier A, Zoete V, Michielin O (2011) SwissDock, a protein-small molecule docking web service based on EADock DSS. Nucleic Acids Res 39:W270–W277
Gu J, Gui Y, Chen L et al (2013) Use of natural products as chemical library for drug discovery and network pharmacology. PLoS One 8:e62839
Gumbart JC, Roux B, Chipot C (2013) Standard binding free energies from computer simulations: what is the best strategy? J Chem Theory Comput 9:794–802
Ha H, Debnath B, Odde S (2015) Discovery of novel CXCR2 inhibitors using ligand-based pharmacophore models. J Chem Inf Model 55:1720–1738
Hansch C, Maloney PP, Fujita T et al (1962) Correlation of biological activity of phenoxyacetic acids with Hammett substituent constants and partition coefficients. Nature 194:178–180
Harvey AL, Edrada-Ebel R, Quinn RJ (2015) The re-emergence of natural products for drug discovery in the genomics era. Nat Rev Drug Discov 14:111–129
Hauser AS, Windshügel B (2016) LEADS-PEP: a benchmark data set for assessment of peptide docking performance. J Chem Inf Model 56:188–200
Hauser AS, Attwood MM, Rask-Andersen M et al (2017) Trends in GPCR drug discovery: new agents, targets and indications. Nat Rev Drug Discov 16:829
Hawkins PC, Skillman AG, Nicholls A (2007) Comparison of shape-matching and docking as virtual screening tools. J Med Chem 50:74–82
Henrick K, Feng Z, Bluhm WF et al (2008) Remediation of the protein data bank archive. Nucleic Acids Res 36:D426–D433
Hessler G, Baringhaus KH (2010) The scaffold hopping potential of pharmacophores. Drug Discov Today Technol 7:e263–e269
Hochleitner J, Akram M, Ueberall M et al (2017) A combinatorial approach for the discovery of cytochrome P450 2D6 inhibitors from nature. Sci Rep 7:8071
Hu Y, Stumpfe D, Bajorath J (2013) Advancing the activity cliff concept. F1000Res 2:199
Huang SY, Grinter SZ, Zou X (2010) Scoring functions and their evaluation methods for protein-ligand docking: recent advances and future directions. Phys Chem Chem Phys 12:12899–12908
Jain AN, Nicholls A (2008) Recommendations for evaluation of computational methods. J Comput Aided Mol Des 22:133–139
Jayaseelan KV, Moreno P, Truszkowski A et al (2012) Natural product-likeness score revisited: an open-source, open-data implementation. BMC Bioinformatics 13:106
Jones G, Willett P, Glen RC et al (1997) Development and validation of a genetic algorithm for flexible docking. J Mol Biol 267:727–748
Karaboga AS, Planesas JM, Petronin F et al (2013) Highly specific and sensitive pharmacophore model for identifying CXCR4 antagonists. Comparison with docking and shape-matching virtual screening performance. J Chem Inf Model 53:1043–1056
Karelson M, Lobanov VS, Katritzky AR (1996) Quantum-chemical descriptors in QSAR/QSPR studies. Chem Rev 96:1027–1044
Kaserer T, Schuster D, Rollinger JM (2018) Chapter 6.3. Chemoinformatics in natural product research. In: Applied chemoinformatics: achievements and future opportunities. Wiley-VCH, Weinheim
Kim S, Thiessen PA, Bolton EE et al (2016) PubChem substance and compound databases. Nucleic Acids Res 44:D1202–D1213
Kirchmair J, Markt P, Distinto S et al (2008) Evaluation of the performance of 3D virtual screening protocols: RMSD comparisons, enrichment assessments, and decoy selection – what can we learn from earlier mistakes? J Comput Aided Mol Des 22:213–228
Kirchmair J, Distinto S, Markt P et al (2009) How to optimize shape-based virtual screening: choosing the right query and including chemical information. J Chem Inf Model 49:678–692
Kirchmair J, Goller AH, Lang D et al (2015) Predicting drug metabolism: experiment and/or computation? Nat Rev Drug Discov 14:387–404
Kirchweger B, Kratz JM, Ladurner A et al (2018) In silico workflow for the identification of natural products targeting GPBAR1. Front Chem 6:242
Kitchen DB, Decornez H, Furr JR (2004) Docking and scoring in virtual screening for drug discovery: methods and applications. Nat Rev Drug Discov 3:935–949
Klabunde T, Evers A (2005) GPCR antitarget modeling: pharmacophore models for biogenic amine binding GPCRs to avoid GPCR-mediated side effects. Chembiochem 6:876–889
Koeberle A, Werz O (2014) Multi-target approach for natural products in inflammation. Drug Discov Today 19:1871–1882
Kortagere S, Krasowski MD, Ekins S (2009) The importance of discerning shape in molecular pharmacology. Trends Pharmacol Sci 30:138–147
Kratz JM, Schuster D, Edtbauer M et al (2014) Experimentally validated hERG pharmacophore models as cardiotoxicity prediction tools. J Chem Inf Model 54:2887–2901
Kratz JM, Mair CE, Oettl SK et al (2016) hERG channel blocking ipecac alkaloids identified by combined in silico – in vitro screening. Planta Med 82:1009–1015
Kratz JM, Grienke U, Scheel O et al (2017) Natural products modulating the hERG channel: heartaches and hope. Nat Prod Rep 34:957–980
Lagorce D, Bouslama L, Becot J et al (2017) FAF-Drugs4: free ADME-tox filtering computations for chemical biology and early stages drug discovery. Bioinformatics 33:3658–3660
Langer T, Wolber G (2004) Pharmacophore definition and 3D searches. Drug Discov Today Technol 1:203–207
Larsson J, Gottfries J, Muresan S et al (2007) ChemGPS-NP: tuned for navigation in biologically relevant chemical space. J Nat Prod 70:789–794
Lavecchia A (2015) Machine-learning approaches in drug discovery: methods and applications. Drug Discov Today 20:318–331
Lipinski CA, Lombardo F, Dominy BW et al (2001) Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev 46:3–26
Liu K, Kokubo H (2017) Exploring the stability of ligand binding modes to proteins by molecular dynamics simulations: a cross-docking study. J Chem Inf Model 57:2514–2522
Lo YC, Rensi SE, Torng W et al (2018) Machine learning in chemoinformatics and drug discovery. Drug Discov Today. PII: S1359-6446(17)30469-5
Ma DL, Chan DSH, Leung CH (2011) Molecular docking for virtual screening of natural product databases. Chem Sci 2:1656–1665
Macarron R (2006) Critical review of the role of HTS in drug discovery. Drug Discov Today 11:277–279
Makeneni S, Thieker DF, Woods RJ (2018) Applying pose clustering and MD simulations to eliminate false positives in molecular docking. J Chem Inf Model 58:605–614
Malo M, Brive L, Luthman K et al (2010) Selective pharmacophore models of dopamine D(1) and D(2) full agonists based on extended pharmacophore features. ChemMedChem 5:232–246
Matthias B, Clare H (2011) The Nagoya protocol on access to genetic resources and the fair and equitable sharing of benefits arising from their utilization to the convention on biological diversity. R.E.C.I.E.L. 20:47–61
Medina-Franco JL, Maggiora GM, Giulianotti MA et al (2007) A similarity-based data-fusion approach to the visual characterization and comparison of compound databases. Chem Biol Drug Des 70:393–412
Mollica L, Decherchi S, Zia SR et al (2015) Kinetics of protein-ligand unbinding via smoothed potential molecular dynamics simulations. Sci Rep 5:11539
Mortier J, Prévost JRC, Sydow D et al (2017) Arginase structure and inhibition: catalytic site plasticity reveals new modulation possibilities. Sci Rep 7:13616
Mulholland K, Wu C (2016) Binding of telomestatin to a telomeric G-quadruplex DNA probed by all-atom molecular dynamics simulations with explicit solvent. J Chem Inf Model 56:2093–2102
Mysinger MM, Carchia M, Irwin JJ et al (2012) Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. J Med Chem 55:6582–6594
Nikolova N, Jaworska J (2003) Approaches to measure chemical similarity – a review. QSAR Comb Sci 22:1006–1026
O’Boyle NM (2012) Towards a Universal SMILES representation – a standard method to generate canonical SMILES based on the InChI. J Cheminform 4:22
Oda A, Tsuchida K, Takakura T et al (2006) Comparison of consensus scoring strategies for evaluating computational models of protein-ligand complexes. J Chem Inf Model 46:380–391
Osterberg F, Morris GM, Sanner MF et al (2002) Automated docking to multiple target structures: incorporation of protein mobility and structural water heterogeneity in AutoDock. Proteins 46:34–40
Pagadala NS, Syed K, Tuszynski J (2017) Software for molecular docking: a review. Biophys Rev 9:91–102
Pándy-Szekeres G, Munk C, Tsonkov TM et al (2018) GPCRdb in 2018: adding GPCR structure models and ligands. Nucleic Acids Res 46:D440–D446
Pang YP, Kozikowski AP (1994) Prediction of the binding sites of huperzine A in acetylcholinesterase by docking studies. J Comput Aided Mol Des 8:669–681
Pardridge WM (2005) The blood-brain barrier: bottleneck in brain drug development. NeuroRx 2:3–14
Payne DJ, Gwynn MN, Holmes DJ et al (2006) Drugs for bad bugs: confronting the challenges of antibacterial discovery. Nat Rev Drug Discov 6:29
Pereira JC, Caffarena ER, Dos Santos CN (2016) Boosting docking-based virtual screening with deep learning. J Chem Inf Model 56:2495–2506
Pye CR, Bertin MJ, Lokey RS et al (2017) Retrospective analysis of natural products provides insights for future discovery trends. Proc Natl Acad Sci U S A 114:5601–5606
Rastelli G, Degliesposti G, Del Rio A et al (2009) Binding estimation after refinement, a new automated procedure for the refinement and rescoring of docked ligands in virtual screening. Chem Biol Drug Des 73:283–286
Reker D, Perna AM, Rodrigues T et al (2014) Revealing the macromolecular targets of complex natural products. Nat Chem 6:1072–1078
Ren J, He Y, Chen W et al (2014) Thermodynamic and structural characterization of halogen bonding in protein-ligand interactions: a case study of PDE5 and its inhibitors. J Med Chem 57:3588–3593
Rester U (2008) From virtuality to reality – virtual screening in lead discovery and lead optimization: a medicinal chemistry perspective. Curr Opin Drug Discov Devel 11:559–568
Reymond JL, Van Deursen R, Blum LC et al (2010) Chemical space as a source for new drugs. MedChemComm 1:30–38
Rodrigues T, Sieglitz F, Somovilla VJ et al (2016) Unveiling (−)-Englerin A as a modulator of L-type calcium channels. Angew Chem Int Ed Engl 55:11077–11081
Rollinger JM (2009) Accessing target information by virtual parallel screening – the impact on natural product research. Phytochem Lett 2:53–58
Rollinger JM, Wolber G (2011) Computational approaches for the discovery of natural lead structures. In: Bioactive compounds from natural sources, Natural products as lead compounds in drug discovery, 2nd edn. CRC Press, Boca Raton, pp 167–186
Rollinger JM, Haupt S, Stuppner H et al (2004) Combining ethnopharmacology and virtual screening for lead structure discovery: COX-inhibitors as application example. J Chem Inf Comput Sci 44:480–488
Rollinger JM, Bodensieck A, Seger A et al (2005) Discovering COX-inhibiting constituents of Morus root bark: activity-guided versus computer-aided methods. Planta Med 71:399–405
Rollinger JM, Langer T, Stuppner H (2006a) Integrated in silico tools for exploiting the natural products’ bioactivity. Planta Med 72:671–678
Rollinger JM, Langer T, Stuppner H (2006b) Strategies for efficient lead structure discovery from natural products. Curr Med Chem 13:1491–1507
Rollinger JM, Steindl TM, Schuster D et al (2008) Structure-based virtual screening for the discovery of natural inhibitors for human rhinovirus coat protein. J Med Chem 51:842–851
Rollinger JM, Schuster D, Danzl B et al (2009) In silico target fishing for rationalized ligand discovery exemplified on constituents of Ruta graveolens. Planta Med 75:195–204
Rush TS, Grant JA, Mosyak L et al (2005) A shape-based 3-D scaffold hopping method and its application to a bacterial protein−protein interaction. J Med Chem 48:1489–1495
Sabbadin D, Ciancetta A, Moro S (2014) Bridging molecular docking to membrane molecular dynamics to investigate GPCR-ligand recognition: the human A2A adenosine receptor as a key study. J Chem Inf Model 54:169–183
Santos R, Ursu O, Gaulton A (2016) A comprehensive map of molecular drug targets. Nat Rev Drug Discov 16:19
Sawada R, Kotera M, Yamanishi Y (2014) Benchmarking a wide range of chemical descriptors for drug-target interaction prediction using a chemogenomic approach. Mol Inform 33:719–731
Schneider G (2010) Virtual screening: an endless staircase? Nat Rev Drug Discov 9:273–276
Schneider G (2017) Automating drug discovery. Nat Rev Drug Discov 17:97
Schneider P, Schneider G (2017) A computational method for unveiling the target promiscuity of pharmacologically active compounds. Angew Chem Int Ed Engl 56:11520–11524
Schuster D, Waltenberger B, Kirchmair J et al (2010) Predicting cyclooxygenase inhibition by three-dimensional pharmacophoric profiling. Part I: model generation, validation and applicability in ethnopharmacology. Mol Inform 29:75–86
Schuster D, Markt P, Grienke U et al (2011) Pharmacophore-based discovery of FXR agonists. Part I: model development and experimental validation. Bioorg Med Chem 19:7168–7180
Scior T, Bender A, Tresadern G et al (2012) Recognizing pitfalls in virtual screening: a critical review. J Chem Inf Model 52:867–881
Seidel T, Ibis G, Bendix F et al (2010) Strategies for 3D pharmacophore-based virtual screening. Drug Discov Today Technol 7:e221–e228
Sheridan RP (2008) Alternative global goodness metrics and sensitivity analysis: heuristics to check the robustness of conclusions from studies comparing virtual screening methods. J Chem Inf Model 48:426–433
Shin WH, Zhu X, Bures MG et al (2015) Three-dimensional compound comparison methods and their application in drug discovery. Molecules 20:12841–12862
Shoichet BK, McGovern SL, Wei B et al (2002) Lead discovery using molecular docking. Curr Opin Chem Biol 6:439–446
Sichao W, Youyon GL, Lei X et al (2013) Recent developments in computational prediction of hERG blockage. Curr Top Med Chem 13:1317–1326
Singh N, Guha R, Giulianotti MA et al (2009) Chemoinformatic analysis of combinatorial libraries, drugs, natural products, and molecular libraries small molecule repository. J Chem Inf Model 49:1010–1024
Sliwoski G, Kothiwale S, Meiler J et al (2014) Computational methods in drug discovery. Pharmacol Rev 66:334–395
Sliwoski G, Mendenhall J, Meiler J (2016) Autocorrelation descriptor improvements for QSAR: 2DA_Sign and 3DA_Sign. J Comput Aided Mol Des 30:209–217
Sohn YS, Park C, Lee Y et al (2013) Multi-conformation dynamic pharmacophore modeling of the peroxisome proliferator-activated receptor γ for the discovery of novel agonists. J Mol Graph Model 46:1–9
Sperandio O, Mouawad L, Pinto E et al (2010) How to choose relevant multiple receptor conformations for virtual screening: a test case of Cdk2 and normal mode analysis. Eur Biophys J 39:1365–1372
Spitzer GM, Heiss M, Mangold M et al (2010) One concept, three implementations of 3D pharmacophore-based virtual screening: distinct coverage of chemical search space. J Chem Inf Model 50:1241–1247
Spyrakis F, Benedetti P, Decherchi S et al (2015) A pipeline to enhance ligand virtual screening: integrating molecular dynamics and fingerprints for ligand and proteins. J Chem Inf Model 55:2256–2274
Steindl TM, Schuster D, Laggner C et al (2006a) Parallel screening: a novel concept in pharmacophore modeling and virtual screening. J Chem Inf Model 46:2146–2157
Steindl TM, Schuster D, Wolber G et al (2006b) High-throughput structure-based pharmacophore modelling as a basis for successful parallel virtual screening. J Comput Aided Mol Des 20:703–715
Strohl WR (2000) The role of natural products in a modern drug discovery program. Drug Discov Today 5:39–41
Stumpfe D, De La Vega De Leon A, Dimova D et al (2014) Advancing the activity cliff concept, part II. F1000Res 3:75
Su H, Yan J, Xu J et al (2015) Stepwise high-throughput virtual screening of Rho kinase inhibitors from natural product library and potential therapeutics for pulmonary hypertension. Pharm Biol 53:1201–1206
Tarcsay A, Paragi G, Vass M et al (2013) The impact of molecular dynamics sampling on the performance of virtual screening against GPCRs. J Chem Inf Model 53:2990–2999
Taylor RD, Jewsbury PJ, Essex JW (2002) A review of protein-small molecule docking methods. J Comput Aided Mol Des 16:151–166
Tetko IV (2003) The WWW as a tool to obtain molecular parameters. Mini Rev Med Chem 3:809–820
Tian S, Sun H, Pan P et al (2014) Assessing an ensemble docking-based virtual screening strategy for kinase targets by considering protein flexibility. J Chem Inf Model 54:2664–2679
Todeschini R, Consonni V (2008) Handbook of molecular descriptors. Wiley-VCH, Weinheim
Truchon JF, Bayly CI (2007) Evaluating virtual screening methods: good and bad metrics for the “early recognition” problem. J Chem Inf Model 47:488–508
Van Drie JH (2010) History of 3D pharmacophore searching: commercial, academic and open-source tools. Drug Discov Today Technol 7:e255–e262
Varnek A, Baskin I (2012) Machine learning methods for property prediction in chemoinformatics: quo vadis? J Chem Inf Model 52:1413–1437
Veber DF, Johnson SR, Cheng HY et al (2002) Molecular properties that influence the oral bioavailability of drug candidates. J Med Chem 45:2615–2623
Vuorinen A, Engeli R, Meyer A et al (2014) Ligand-based pharmacophore modeling and virtual screening for the discovery of novel 17β-hydroxysteroid dehydrogenase 2 inhibitors. J Med Chem 57:5995–6007
Waltenberger B, Atanasov AG, Heiss EH et al (2016) Drugs from nature targeting inflammation (DNTI): a successful Austrian interdisciplinary network project. Monatsh Chem 147:479–491
Wang G, Zhu W (2016) Molecular docking for drug discovery and development: a widely used approach but far from perfect. Future Med Chem 8:1707–1710
Wang JC, Chu PY, Chen CM et al (2012) idTarget: a web server for identifying protein targets of small chemical molecules with robust scoring functions and a divide-and-conquer docking approach. Nucleic Acids Res 40:W393–W399
Wang Z, Sun H, Yao X et al (2016) Comprehensive evaluation of ten docking programs on a diverse set of protein-ligand complexes: the prediction accuracy of sampling power and scoring power. Phys Chem Chem Phys 18:12964–12975
Warren GL, Andrews CW, Capelli AM et al (2006) A critical assessment of docking programs and scoring functions. J Med Chem 49:5912–5931
Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28:31–36
Wermuth CG, Ganellin CR, Lindberg P et al (1998) Glossary of terms used in medicinal chemistry (IUPAC recommendations 1998). Pure Appl Chem 70:1129
Wetzel S, Schuffenhauer A, Roggo S et al (2007) Cheminformatic analysis of natural products and their chemical space. Chimia 61:355–360
Wieder M, Garon A, Perricone U et al (2017) Common Hits approach: combining pharmacophore modeling and molecular dynamics simulations. J Chem Inf Model 57:365–385
Wójcikowski M, Ballester PJ, Siedlecki P (2017) Performance of machine-learning scoring functions in structure-based virtual screening. Sci Rep 7:46710
Yan SF, King FJ, He Y et al (2006) Learning from the data: mining of large high-throughput screening databases. J Chem Inf Model 46:2381–2395
Yang Y, Xu Z, Zhang Z et al (2015) Like-charge guanidinium pairing between ligand and receptor: an unusual interaction for drug discovery and design? J Phys Chem B 119:11988–11997
Zhu T, Cao S, Su PC et al (2013) Hit identification and optimization in virtual screening: practical recommendations based on a critical literature analysis. J Med Chem 56:6560–6572
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Kirchweger, B., Rollinger, J.M. (2018). Virtual Screening for the Discovery of Active Principles from Natural Products. In: Cechinel Filho, V. (eds) Natural Products as Source of Molecules with Therapeutic Potential. Springer, Cham. https://doi.org/10.1007/978-3-030-00545-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00544-3
Online ISBN: 978-3-030-00545-0
eBook Packages: Chemistry and Materials Science (R0)