Biomine: A Network-Structured Resource of Biological Entities for Link Prediction

Eronen, Lauri; Hintsanen, Petteri; Toivonen, Hannu

doi:10.1007/978-3-642-31830-6_26

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7250)

Abstract

Biomine is a biological graph database constructed from public databases. Its entities (vertices) include biological concepts (such as genes, proteins, tissues, processes and phenotypes, as well as scientific articles) and relations (edges) between these entities correspond to real-world phenomena such as “a gene codes for a protein” or “an article refers to a phenotype”. Biomine also provides tools for querying the graph for connections and visualizing them interactively.
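The data model described above can be illustrated with a minimal sketch. It is not Biomine's implementation or API: the class name EntityGraph, the entity identifiers (GENE_A, PROTEIN_A, ARTICLE_1, PHENOTYPE_X) and the relation labels are hypothetical, and a plain breadth-first search stands in for whatever connection query the real system uses.

```python
from collections import defaultdict, deque


class EntityGraph:
    """Toy labeled graph: typed entities (vertices), named relations (edges)."""

    def __init__(self):
        self.entity_type = {}                # entity name -> type label
        self.relations = defaultdict(dict)   # entity -> {neighbor: relation label}

    def add_entity(self, name, entity_type):
        self.entity_type[name] = entity_type

    def add_relation(self, source, relation, target):
        for name in (source, target):
            if name not in self.entity_type:
                raise KeyError(f"unknown entity: {name}")
        # Assumption: edges are stored in both directions so that connection
        # queries can traverse them either way; edge direction is not kept.
        self.relations[source][target] = relation
        self.relations[target][source] = relation

    def shortest_connection(self, start, goal):
        """Return one shortest path of entity names, or None if unconnected."""
        if start not in self.entity_type or goal not in self.entity_type:
            return None
        previous = {start: None}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if current == goal:
                path = []
                while current is not None:
                    path.append(current)
                    current = previous[current]
                return path[::-1]
            for neighbor in self.relations.get(current, {}):
                if neighbor not in previous:
                    previous[neighbor] = current
                    queue.append(neighbor)
        return None


# Hypothetical toy data mirroring the two example relations in the abstract.
graph = EntityGraph()
graph.add_entity("GENE_A", "gene")
graph.add_entity("PROTEIN_A", "protein")
graph.add_entity("ARTICLE_1", "article")
graph.add_entity("PHENOTYPE_X", "phenotype")
graph.add_relation("GENE_A", "codes_for", "PROTEIN_A")
graph.add_relation("ARTICLE_1", "refers_to", "PHENOTYPE_X")
graph.add_relation("ARTICLE_1", "refers_to", "GENE_A")

connection = graph.shortest_connection("PROTEIN_A", "PHENOTYPE_X")
```

In this toy graph the protein and the phenotype are not directly related, but they are connected through the gene and an article that mentions both, which is the kind of indirect connection a query for connecting subgraphs is meant to surface.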

We describe the Biomine graph database. We also discuss link discovery in such biological graphs and review possible link prediction measures. Biomine currently contains over 1 million entities and over 8 million relations between them, with focus on human genetics. It is available on-line and can be queried for connecting subgraphs between biological entities.
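As a rough illustration of what a link prediction measure computes, the sketch below scores a pair of unlinked vertices with three generic neighborhood-based measures from the link prediction literature (common neighbors, Jaccard coefficient, Adamic/Adar). The abstract does not say which measures the chapter reviews, so these are an assumption, and the toy graph and its vertex names are hypothetical.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbors}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def common_neighbors(adjacency, u, v):
    """Number of vertices adjacent to both u and v."""
    return len(adjacency[u] & adjacency[v])


def jaccard(adjacency, u, v):
    """Shared neighbors divided by all neighbors of u or v."""
    union = adjacency[u] | adjacency[v]
    return len(adjacency[u] & adjacency[v]) / len(union) if union else 0.0


def adamic_adar(adjacency, u, v):
    """Shared neighbors weighted by 1 / log(degree); rare neighbors count more."""
    return sum(
        1.0 / math.log(len(adjacency[z]))
        for z in adjacency[u] & adjacency[v]
        if len(adjacency[z]) > 1  # guard against log(1) = 0
    )


# Hypothetical toy graph: two genes sharing a protein family and an article.
edges = [
    ("GENE_1", "FAMILY_1"),
    ("GENE_1", "ARTICLE_1"),
    ("GENE_2", "FAMILY_1"),
    ("GENE_2", "ARTICLE_1"),
    ("GENE_2", "ARTICLE_2"),
    ("ARTICLE_1", "PHENOTYPE_1"),
]
adjacency = build_adjacency(edges)

cn_score = common_neighbors(adjacency, "GENE_1", "GENE_2")
jaccard_score = jaccard(adjacency, "GENE_1", "GENE_2")
aa_score = adamic_adar(adjacency, "GENE_1", "GENE_2")
```

All three scores rank a candidate pair higher when the two vertices share more of their neighborhood; Adamic/Adar additionally discounts shared neighbors that are connected to many other vertices.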

Author information

Authors and Affiliations

Department of Computer Science and HIIT, University of Helsinki, Finland
Lauri Eronen, Petteri Hintsanen & Hannu Toivonen

Editor information

Editors and Affiliations

Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
Michael R. Berthold

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Eronen, L., Hintsanen, P., Toivonen, H. (2012). Biomine: A Network-Structured Resource of Biological Entities for Link Prediction. In: Berthold, M.R. (eds) Bisociative Knowledge Discovery. Lecture Notes in Computer Science, vol 7250. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31830-6_26

DOI: https://doi.org/10.1007/978-3-642-31830-6_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31829-0
Online ISBN: 978-3-642-31830-6
eBook Packages: Computer Science, Computer Science (R0)
