Abstract
The abundance of publicly available life science databases offers a wealth of information that can support the interpretation of experimentally derived data and greatly enhance hypothesis generation. Protein interaction and functional networks are not simply new renditions of existing data: they provide the opportunity to gain insight into the specific physical and functional roles a protein plays within a biological system. In this chapter, we describe in silico tools that quickly and conveniently retrieve data from existing repositories, and we discuss how each is best used for different purposes. While emphasizing protein–protein interaction databases (e.g., BioGRID and IntAct), we also introduce metasearch platforms such as STRING and GeneMANIA, pathway databases (e.g., BioCarta and Pathway Commons), text mining approaches (e.g., PubMed and Chilibot), and resources for drug–protein interactions, genetic information for model organisms, and gene expression information based on microarray data mining. Furthermore, we provide a simple step-by-step protocol for building customized protein–protein interaction networks in Cytoscape, a powerful network assembly and visualization program, integrating data retrieved from these various databases. As we illustrate, generating composite interaction networks enables investigators to extract significantly more information about a given biological system than using a single database or relying solely on primary literature.
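The idea of a composite interaction network can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's protocol: the function names `merge_interactions` and `to_tsv` are hypothetical, the protein identifiers are placeholders rather than database results, and matching identifiers by upper-casing is an assumption that stands in for proper ID conversion. The sketch merges edge lists labeled by source database and records which sources support each interaction.

```python
from collections import defaultdict


def merge_interactions(sources):
    """Merge per-database edge lists into one undirected composite network.

    `sources` maps a database label to an iterable of (protein_a, protein_b)
    pairs. Returns a dict mapping each sorted protein pair to the set of
    database labels that report it.
    """
    composite = defaultdict(set)
    for database, pairs in sources.items():
        for a, b in pairs:
            # Assumption: identifiers differ only by case/whitespace.
            a, b = a.strip().upper(), b.strip().upper()
            if a == b:
                continue  # self-interactions are skipped in this sketch
            # Sorting the pair makes A-B and B-A the same undirected edge.
            composite[tuple(sorted((a, b)))].add(database)
    return dict(composite)


def to_tsv(composite):
    """Render the composite network as a tab-delimited edge table."""
    lines = ["source\ttarget\tdatabases\tn_databases"]
    for (a, b), dbs in sorted(composite.items()):
        lines.append(f"{a}\t{b}\t{';'.join(sorted(dbs))}\t{len(dbs)}")
    return "\n".join(lines)


# Placeholder edges for illustration only; not retrieved from any database.
toy_sources = {
    "BioGRID": [("PROT_A", "PROT_B"), ("PROT_C", "PROT_B")],
    "IntAct": [("prot_b", "PROT_A"), ("PROT_D", "PROT_A")],
}

network = merge_interactions(toy_sources)
print(to_tsv(network))
```

An edge table of this kind, with one row per interaction and a column listing the supporting sources, is the sort of delimited file that can typically be imported into Cytoscape as a network, where the number of supporting databases could then be mapped to a visual property such as edge width.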
Acknowledgments
The authors were supported by U54 CA149147, R01 CA63366, and P50 CA083638 from the NIH (to EAG), a postdoctoral fellowship from the SASS Foundation for Medical Research and an Ann Schreiber Program of Excellence Grant from the Ovarian Cancer Research Fund (to HL), the Drexel University College of Medicine MD-PhD Program (to TNB), and NIH core grant CA06927 (to Fox Chase Cancer Center).
© 2014 Springer Science+Business Media, LLC
Cite this protocol
Liu, H., Beck, T.N., Golemis, E.A., Serebriiskii, I.G. (2014). Integrating In Silico Resources to Map a Signaling Network. In: Ochs, M. (eds) Gene Function Analysis. Methods in Molecular Biology, vol 1101. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-721-1_11
DOI: https://doi.org/10.1007/978-1-62703-721-1_11
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-720-4
Online ISBN: 978-1-62703-721-1
eBook Packages: Springer Protocols