Abstract
With the increasing importance of big data in biomedicine, data science skills are a foundation for both individual career development and the progress of science. This chapter is a practical guide to working with high-throughput biomedical data. It covers how to understand and set up the computing environment, how to start a research project with proper and effective data management, and how to perform common bioinformatics tasks such as data wrangling, quality control, statistical analysis, and visualization, with examples using metabolomics data. Concepts and tools related to coding and scripting are discussed. Version control, knitr, and Jupyter notebooks are important for project management, collaboration, and research reproducibility. Overall, this chapter describes a core set of skills for working in bioinformatics and can serve as a reference text at the level of a graduate course that interfaces with data science.
Acknowledgments
This work has been funded, in part, by the US National Institutes of Health via grants UH2 AI132345 (Li), U2C ES030163 (Jones, Li, Morgan, Miller), U01 CA235493 (Li, Xia, Siuzdak), U2C ES026560 (Miller), P30 ES019776 (Marsit), P50 ES026071 (McCauley), and the US EPA grant 83615301 (McCauley).
Copyright information
© 2020 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Pittard, W.S., Villaveces, C.K., Li, S. (2020). A Bioinformatics Primer to Data Science, with Examples for Metabolomics. In: Li, S. (eds) Computational Methods and Data Analysis for Metabolomics. Methods in Molecular Biology, vol 2104. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0239-3_14
DOI: https://doi.org/10.1007/978-1-0716-0239-3_14
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0238-6
Online ISBN: 978-1-0716-0239-3
eBook Packages: Springer Protocols