Abstract
The Protein Data Bank (PDB), the single global repository of experimentally determined 3D structures of biological macromolecules and their complexes, was established in 1971, becoming the first open-access digital resource in the biological sciences. The PDB archive currently houses ~130,000 entries (May 2017). It is managed by the Worldwide Protein Data Bank organization (wwPDB; wwpdb.org), which includes the RCSB Protein Data Bank (RCSB PDB; rcsb.org), the Protein Data Bank Japan (PDBj; pdbj.org), the Protein Data Bank in Europe (PDBe; pdbe.org), and BioMagResBank (BMRB; www.bmrb.wisc.edu). The four wwPDB partners operate a unified global software system that enforces community-agreed data standards and supports data Deposition, Biocuration, and Validation of ~11,000 new PDB entries annually (deposit.wwpdb.org). The RCSB PDB currently acts as the archive keeper, ensuring disaster recovery of PDB data and coordinating weekly updates. The wwPDB partners disseminate the same archival data from multiple FTP sites, while operating complementary websites that provide their own views of PDB data with selected value-added information and links to related data resources. At present, the PDB archives experimental data, associated metadata, and atomic-level 3D structural models derived from three well-established methods: crystallography, nuclear magnetic resonance spectroscopy (NMR), and electron microscopy (3DEM). The wwPDB partners are working closely with experts in related experimental areas (small-angle scattering, chemical cross-linking/mass spectrometry, Förster resonance energy transfer (FRET), etc.) to establish a federation of data resources that will support sustainable archiving and validation of 3D structural models and experimental data derived from integrative or hybrid methods.
Acknowledgments
The RCSB PDB is supported by the National Science Foundation (DBI 1338415), National Institutes of Health, and the Department of Energy; PDBe by the Wellcome Trust, BBSRC, MRC, EU, CCP4, and EMBL-EBI; PDBj by JST-NBDC; and BMRB by the National Institute of General Medical Sciences (GM109046). We thank Christine Zardecki for expert help with manuscript preparation.
Copyright information
© 2017 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Burley, S.K., Berman, H.M., Kleywegt, G.J., Markley, J.L., Nakamura, H., Velankar, S. (2017). Protein Data Bank (PDB): The Single Global Macromolecular Structure Archive. In: Wlodawer, A., Dauter, Z., Jaskolski, M. (eds) Protein Crystallography. Methods in Molecular Biology, vol 1607. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7000-1_26
DOI: https://doi.org/10.1007/978-1-4939-7000-1_26
Published:
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6998-2
Online ISBN: 978-1-4939-7000-1
eBook Packages: Springer Protocols