Chemoinformatics and Library Design

Chemical Library Design

Part of the book series: Methods in Molecular Biology (MIMB, volume 685)

Abstract

This chapter provides a brief overview of chemoinformatics and its applications to chemical library design. It is meant to be a quick starter and to serve as an invitation to readers for more in-depth exploration of the field. The topics covered in this chapter are chemical representation, chemical data and data mining, molecular descriptors, chemical space and dimension reduction, quantitative structure–activity relationship, similarity, diversity, and multiobjective optimization.

Acknowledgment

This chapter was prepared while the author was visiting Professor Andy McCammon’s group. The author is very grateful to Professor McCammon and his group for the exciting and stimulating scientific environment during its preparation.

Author information

Corresponding author

Correspondence to Joe Zhongxiang Zhou.

Copyright information

© 2011 Springer-Science+Business Media, LLC

About this protocol

Cite this protocol

Zhou, J.Z. (2011). Chemoinformatics and Library Design. In: Zhou, J. (ed.) Chemical Library Design. Methods in Molecular Biology, vol 685. Humana Press. https://doi.org/10.1007/978-1-60761-931-4_2

  • DOI: https://doi.org/10.1007/978-1-60761-931-4_2

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-930-7

  • Online ISBN: 978-1-60761-931-4

  • eBook Packages: Springer Protocols
