Abstract
lazar is a new tool for the prediction of toxic properties of chemical structures. It derives predictions for query structures from a database of experimentally determined toxicity data. lazar generates predictions by searching the database for compounds that are similar with respect to a given toxic activity and calculating the prediction from their activities. Apart from the prediction, lazar provides the rationales (structural features and similar compounds) for the prediction and a reliable confidence index that indicates whether a query structure falls within the applicability domain of the training database.
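The general idea of predicting from similar database compounds can be sketched as follows. This is a minimal illustration and not lazar's actual algorithm: the Tanimoto similarity on feature sets, the similarity-weighted vote, the `min_sim` cutoff and the names `tanimoto` and `predict` are all assumptions made for this example.

```python
from typing import FrozenSet, List, Optional, Sequence, Tuple

Compound = Tuple[FrozenSet[str], bool]  # (structural features, is_active)


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict(
    query: FrozenSet[str],
    training: Sequence[Compound],
    min_sim: float = 0.3,  # hypothetical similarity cutoff
) -> Tuple[Optional[bool], float, List[Tuple[float, bool]]]:
    """Return (prediction, confidence, neighbours) for a query structure.

    The prediction is a similarity-weighted vote of the neighbours.
    The confidence is |vote| / total similarity, so it is 0.0 when no
    neighbour is found or when the neighbours disagree evenly.
    """
    scored = [(tanimoto(query, feats), active) for feats, active in training]
    neighbours = [(s, active) for s, active in scored if s >= min_sim]
    if not neighbours:
        # No similar compound: treated here as outside the applicability domain.
        return None, 0.0, []
    vote = sum(s if active else -s for s, active in neighbours)
    total = sum(s for s, _ in neighbours)
    return vote > 0, abs(vote) / total, neighbours
```

Returning the neighbours alongside the prediction mirrors the idea of reporting similar compounds as a rationale; a low confidence value can then be used to flag queries that are poorly covered by the training data.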
Leave-one-out (LOO) crossvalidation experiments were carried out for 10 carcinogenicity endpoints ({female|male} {hamster|mouse|rat} carcinogenicity and aggregate endpoints {hamster|mouse|rat} carcinogenicity and rodent carcinogenicity) and Salmonella mutagenicity from the Carcinogenic Potency Database (CPDB). An external validation of Salmonella mutagenicity predictions was performed with a dataset of 3895 structures. Leave-one-out and external validation experiments indicate that Salmonella mutagenicity can be predicted with 85% accuracy for compounds within the applicability domain of the CPDB. The LOO accuracy of lazar predictions of rodent carcinogenicity is 86%; the accuracies for the other carcinogenicity endpoints vary between 78% and 95% for structures within the applicability domain.
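A leave-one-out evaluation restricted to confidently predicted structures might look like the sketch below. It is an illustration under stated assumptions, not the evaluation code used in the study: `loo_accuracy`, the `min_confidence` threshold and the toy predictor `overlap_vote` are hypothetical names introduced here.

```python
from typing import Callable, FrozenSet, Optional, Sequence, Tuple

Compound = Tuple[FrozenSet[str], bool]  # (structural features, is_active)
Predictor = Callable[
    [FrozenSet[str], Sequence[Compound]], Tuple[Optional[bool], float]
]


def loo_accuracy(
    data: Sequence[Compound],
    predictor: Predictor,
    min_confidence: float = 0.0,  # hypothetical applicability-domain cutoff
) -> Tuple[Optional[float], int]:
    """Leave-one-out accuracy over predictions that pass the confidence cutoff.

    Each compound is predicted from all other compounds. Predictions that
    are missing or below min_confidence are skipped, so the accuracy refers
    only to the covered structures. Returns (accuracy, number covered).
    """
    correct = covered = 0
    for i, (features, truth) in enumerate(data):
        training = list(data[:i]) + list(data[i + 1:])  # hold out compound i
        prediction, confidence = predictor(features, training)
        if prediction is None or confidence < min_confidence:
            continue
        covered += 1
        correct += int(prediction == truth)
    return (correct / covered if covered else None), covered


def overlap_vote(
    query: FrozenSet[str], training: Sequence[Compound]
) -> Tuple[Optional[bool], float]:
    """Toy predictor: majority vote of compounds sharing any feature."""
    votes = [active for feats, active in training if feats & query]
    if not votes:
        return None, 0.0
    share = sum(votes) / len(votes)  # fraction of active neighbours
    return share >= 0.5, abs(2 * share - 1)
```

Reporting the number of covered structures next to the accuracy makes it visible how many compounds the accuracy figure actually refers to.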
Abbreviations
- CCRIS: chemical carcinogenesis research information system
- CPDB: carcinogenic potency database
- DSSTox: distributed structure-searchable toxicity project
- lazar: lazy structure-activity relationships
- LOO: leave-one-out crossvalidation
- k-nn: k-nearest-neighbours
- (Q)SAR: (quantitative) structure-activity relationships
Cite this article
Helma, C. Lazy structure-activity relationships (lazar) for the prediction of rodent carcinogenicity and Salmonella mutagenicity. Mol Divers 10, 147–158 (2006). https://doi.org/10.1007/s11030-005-9001-5
DOI: https://doi.org/10.1007/s11030-005-9001-5