Abstract
This chapter describes an algorithm for designing the lead-generation libraries required in combinatorial drug discovery. The algorithm simultaneously addresses the two key criteria of diversity and representativeness of compounds in the resulting library, and it is computationally efficient for a large class of lead-generation design problems. The framework also incorporates additional constraints on experimental resources. To make the algorithm scalable, the ability of deterministic annealing to identify clusters is exploited to truncate computations over the entire dataset to computations over individual clusters. An analysis of the algorithm quantifies the trade-off between the error due to truncation and the computational effort. Results on test datasets corroborate the analysis and show improvements by a factor of ten or more on some datasets.
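As a rough illustration of the clustering step the abstract relies on, the following is a minimal sketch of plain deterministic annealing with squared Euclidean distance, in the general form of soft assignments that harden as a temperature parameter is lowered. It is not the chapter's scalable or resource-constrained algorithm. The function names, the geometric cooling schedule, the default parameter values, and the small fixed perturbation used to let coincident codevectors separate are all assumptions made for this example.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def deterministic_annealing(points, k, t_start=100.0, t_min=1e-3,
                            cooling=0.9, inner_iters=20, eps=1e-4):
    """Return k codevectors for `points` (illustrative sketch only).

    All parameter names and defaults are hypothetical choices for this
    example, not values taken from the chapter.
    """
    n = len(points)
    dim = len(points[0])
    # Start every codevector at the centroid of the data.
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    centers = [list(mean) for _ in range(k)]

    t = t_start
    while t > t_min:
        # Small deterministic perturbation so coincident codevectors
        # can move apart once the temperature is low enough.
        for j in range(k):
            for d in range(dim):
                centers[j][d] += eps * (j + 1) * (d + 1)

        for _ in range(inner_iters):
            sums = [[0.0] * dim for _ in range(k)]
            weights = [0.0] * k
            for p in points:
                dists = [sq_dist(p, c) for c in centers]
                d_min = min(dists)
                # Gibbs association weights; subtracting d_min keeps the
                # exponent non-positive and avoids overflow.
                w = [math.exp(-(dj - d_min) / t) for dj in dists]
                z = sum(w)
                for j in range(k):
                    pj = w[j] / z
                    weights[j] += pj
                    for d in range(dim):
                        sums[j][d] += pj * p[d]
            # Each codevector moves to the weighted centroid of the data.
            for j in range(k):
                if weights[j] > 0.0:
                    centers[j] = [sums[j][d] / weights[j] for d in range(dim)]

        t *= cooling  # geometric cooling schedule (an assumption)

    return centers
```

Called on two well-separated groups of points with k=2, the codevectors should settle near the two group centroids as the temperature approaches zero. At high temperature every point contributes to every codevector, while at low temperature each point effectively contributes to only one; the second regime is the cluster structure that the abstract describes exploiting to truncate computations.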
Copyright information
© 2011 Springer-Science+Business Media, LLC
About this protocol
Cite this protocol
Sharma, P., Salapaka, S., Beck, C. (2011). A Scalable Approach to Combinatorial Library Design. In: Zhou, J. (ed.) Chemical Library Design. Methods in Molecular Biology, vol 685. Humana Press. https://doi.org/10.1007/978-1-60761-931-4_4
DOI: https://doi.org/10.1007/978-1-60761-931-4_4
Publisher Name: Humana Press
Print ISBN: 978-1-60761-930-7
Online ISBN: 978-1-60761-931-4
eBook Packages: Springer Protocols