Abstract
This chapter presents a tutorial overview of selected chemoinformatics methods for assembling, curating, and preparing a chemical database and for assessing its diversity and chemical space. Methods for evaluating structure–activity relationships (SAR) and polypharmacology are also included. The use of open-source tools is emphasized, and step-by-step KNIME workflows illustrate the methods. The methods are applied to a chemical database especially relevant to epi-polypharmacology, an emerging area in drug discovery. However, they could be extended to other therapeutic areas and potentially to other areas of chemistry.
Acknowledgements
This work was supported by the Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica (PAPIIT), grant IA203718, and the National Council of Science and Technology (CONACyT), Mexico, grant number 282785. JJN, FIS-G, and NS-C thank CONACyT for scholarships number 622969, 629458, and 335997, respectively.
Additional information
This work is dedicated to the loving memory of Nicolás Medina Sandoval.
1 Electronic Supplementary Material
The following supplementary KNIME files with exemplary workflows are provided:
Supplementary KNIME Workflow 1
Chemical preprocessing and database curation (KNWF 168 kb)
Supplementary KNIME Workflow 2
Chemical diversity analysis (KNWF 92 kb)
Supplementary KNIME Workflow 3
Consensus diversity plots (KNWF 127 kb)
Supplementary KNIME Workflow 4
SmARt analyses (KNWF 211 kb)
Supplementary KNIME Workflow 5
Chemical space (KNWF 399 kb)
Copyright information
© 2018 Springer Science+Business Media New York
About this protocol
Cite this protocol
Naveja, J.J., Saldívar-González, F.I., Sánchez-Cruz, N., Medina-Franco, J.L. (2018). Cheminformatics Approaches to Study Drug Polypharmacology. In: Roy, K. (eds) Multi-Target Drug Design Using Chem-Bioinformatic Approaches. Methods in Pharmacology and Toxicology. Humana Press, New York, NY. https://doi.org/10.1007/7653_2018_6
DOI: https://doi.org/10.1007/7653_2018_6
Published:
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-8732-0
Online ISBN: 978-1-4939-8733-7
eBook Packages: Springer Protocols