Abstract
Years of meticulous curation of the scientific literature and increasingly reliable computational predictions have resulted in the creation of vast databases of protein interaction data. Over the years, these repositories have become a basic framework in which experiments are analyzed and new directions of research are explored. Here we present an overview of the most widely used protein-protein interaction databases and the methods they employ to gather, combine, and predict interactions. We also point out the trade-off between comprehensiveness and accuracy and the main pitfall scientists have to be aware of before adopting protein interaction databases in any single-gene or genome-wide analysis.
© 2015 Springer Science+Business Media New York
About this protocol
Cite this protocol
Szklarczyk, D., Jensen, L.J. (2015). Protein-Protein Interaction Databases. In: Meyerkord, C., Fu, H. (eds) Protein-Protein Interactions. Methods in Molecular Biology, vol 1278. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-2425-7_3
DOI: https://doi.org/10.1007/978-1-4939-2425-7_3
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-2424-0
Online ISBN: 978-1-4939-2425-7
eBook Packages: Springer Protocols