Abstract
Efficient and comprehensive data management is an indispensable component of modern scientific research and requires effective tools for all but the most trivial experiments. The LabDB system developed and used in our laboratory was originally designed to track the progress of a structure determination pipeline in several large National Institutes of Health (NIH) projects. While initially designed for structural biology experiments, its modular nature makes it readily applicable to laboratories of various sizes in many experimental fields. Over many years, LabDB has transformed into a sophisticated system that integrates a range of biochemical, biophysical, and crystallographic experimental data and harvests data both directly from laboratory instruments and through human input via a web interface. The core module of the system handles many types of universal laboratory management data, such as laboratory personnel, chemical inventories, storage locations, and custom stock solutions. LabDB also tracks various biochemical experiments, including spectrophotometric and fluorescence assays, thermal shift assays, isothermal titration calorimetry experiments, and more. LabDB has been used to manage data for experiments that resulted in over 1200 deposits to the Protein Data Bank (PDB); the system is currently used by the Center for Structural Genomics of Infectious Diseases (CSGID) and several large laboratories. This chapter also provides examples of data mining analyses and warnings about incomplete and inconsistent experimental data. These features, together with its capabilities for detailed tracking, analysis, and auditing of experimental data, make the described system uniquely suited to inspect potential sources of irreproducibility in life sciences research.
Acknowledgments
We thank all the users of our data management programs who, over many years, provided us with numerous complaints, suggestions, and requests that gave us invaluable feedback to improve our tools. This work was supported by the National Institute of General Medical Sciences under Grants GM117080 and GM117325, the National Institutes of Health BD2K program under Grant HG008424, and the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services under Contract Nos. HHSN272201700060C and HHSN272201200026C.
Disclosure statement: One of the authors (W.M.) notes that he has also been involved in the development of state-of-the-art software and data management and mining tools; some of them were commercialized by HKL Research, Inc. and are mentioned in the paper. W.M. is the co-founder of HKL Research, Inc. and a member of the board.
The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
Copyright information
© 2021 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Cooper, D.R. et al. (2021). State-of-the-Art Data Management: Improving the Reproducibility, Consistency, and Traceability of Structural Biology and in Vitro Biochemical Experiments. In: Chen, Y.W., Yiu, CP.B. (eds) Structural Genomics. Methods in Molecular Biology, vol 2199. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0892-0_13
DOI: https://doi.org/10.1007/978-1-0716-0892-0_13
Published:
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0891-3
Online ISBN: 978-1-0716-0892-0
eBook Packages: Springer Protocols