Abstract
High-throughput screening (HTS) techniques are increasingly being adopted across many fields of toxicology. Notably, large-scale research efforts in government, industrial, and academic laboratories are screening millions of chemicals against a variety of biomolecular targets, producing an enormous amount of publicly available HTS assay data. These data give toxicologists important information on how chemicals interact with different biomolecular targets and illustrate potential toxicity mechanisms. Open public data repositories, such as the National Institutes of Health’s PubChem (http://pubchem.ncbi.nlm.nih.gov), were established to accept, store, and share HTS data. Through the PubChem website, users can rapidly obtain PubChem assay results for compounds by using different chemical identifiers (including SMILES, InChIKey, and IUPAC names). However, obtaining these data directly through the PubChem web portal in a format suitable for modeling and other informatics analyses (e.g., gathering PubChem data for hundreds or thousands of chemicals) is not feasible. This chapter introduces two approaches for obtaining HTS assay results for large datasets of compounds from PubChem. First, programmatic access via PubChem’s PUG-REST web service using the Python programming language is described. Second, users who lack programming skills can obtain PubChem data for a large set of compounds directly through the freely available Chemical In vitro–In vivo Profiling (CIIPro) portal (http://www.ciipro.rutgers.edu).
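As a rough illustration of the first approach, the sketch below shows how PUG-REST request URLs for assay summaries might be assembled in Python. It is not the chapter's own code. The helper names (`assay_summary_url`, `cid_lookup_url`, `chunked`, `fetch_text`), the batch size, and the pause between requests are assumptions made for illustration; the URL layout follows the publicly documented PUG-REST pattern of `compound/<namespace>/<identifier>/<operation>/<output>`.

```python
import time
import urllib.parse
import urllib.request

# Base address of the PUG-REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def assay_summary_url(cids, output="JSON"):
    """Build a URL requesting the bioassay summary for one or more PubChem CIDs."""
    cid_list = ",".join(str(int(cid)) for cid in cids)  # int() rejects malformed IDs
    return f"{PUG_REST}/compound/cid/{cid_list}/assaysummary/{output}"


def cid_lookup_url(identifier, namespace="inchikey"):
    """Build a URL that resolves a chemical identifier to PubChem CIDs.

    Assumption: identifiers containing '/' (common in SMILES) may need to be
    sent by POST instead; this sketch only handles path-safe identifiers.
    """
    encoded = urllib.parse.quote(identifier, safe="")
    return f"{PUG_REST}/compound/{namespace}/{encoded}/cids/JSON"


def chunked(items, size):
    """Yield successive batches so that no single request grows too large."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_text(url, pause=0.25, timeout=60):
    """Download one URL as text, then pause briefly to respect usage limits."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    time.sleep(pause)  # hypothetical pause; adjust to PubChem's usage policy
    return text
```

A large compound list could then be processed batch by batch, for example with `for batch in chunked(cids, 50): fetch_text(assay_summary_url(batch))`, where the batch size of 50 is an arbitrary choice rather than a documented limit.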
Copyright information
© 2022 This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply
About this protocol
Cite this protocol
Russo, D.P., Zhu, H. (2022). High-Throughput Screening Assay Profiling for Large Chemical Databases. In: Zhu, H., Xia, M. (eds) High-Throughput Screening Assays in Toxicology. Methods in Molecular Biology, vol 2474. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2213-1_12
DOI: https://doi.org/10.1007/978-1-0716-2213-1_12
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2212-4
Online ISBN: 978-1-0716-2213-1
eBook Packages: Springer Protocols