Abstract
Identifying gene functional modules is an important step towards elucidating gene functions at a global scale. In this paper, we introduce a simple method to construct gene co-expression networks from microarray data, and then propose an efficient spectral clustering algorithm to identify natural communities, which are relatively densely connected sub-graphs, in the network. To assess the effectiveness of our approach and its advantage over existing methods, we develop a novel method to measure the agreement between the gene communities and the modular structures in other reference networks, including protein-protein interaction networks, transcriptional regulatory networks, and gene networks derived from gene annotations. We evaluate the proposed methods on two large-scale gene expression data sets, one from budding yeast and one from Arabidopsis thaliana. The results show that the clusters identified by our method are functionally more coherent than the clusters from several standard clustering algorithms, such as k-means, self-organizing maps, and spectral clustering, and that they agree closely with the modular structures in the reference networks.
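The abstract does not spell out how the network is built or how communities are extracted, so the following is only a minimal sketch of the general pipeline under stated assumptions, not the authors' algorithm. It assumes a hard Pearson-correlation threshold for edges, a two-way split by the sign of the Fiedler vector of the normalized Laplacian, and Newman-Girvan modularity as the measure of how densely connected the resulting groups are. All function names and the `threshold` parameter are hypothetical.

```python
import numpy as np

def coexpression_adjacency(expr, threshold=0.8):
    """Link two genes when the Pearson correlation of their profiles
    (rows of `expr`, genes x conditions) reaches `threshold`.
    The thresholding rule is an assumption made for illustration."""
    corr = np.corrcoef(expr)
    adj = (corr >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj

def spectral_bipartition(adj):
    """Split a connected graph in two by the sign of the Fiedler vector of
    the symmetric normalized Laplacian (a stand-in, not the paper's method)."""
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    return (vecs[:, 1] > 0).astype(int)

def modularity(adj, labels):
    """Newman-Girvan modularity Q of a partition of an undirected graph."""
    two_m = adj.sum()  # each undirected edge is counted twice
    deg = adj.sum(axis=1)
    same = labels[:, None] == labels[None, :]
    return float(((adj - np.outer(deg, deg) / two_m) * same).sum() / two_m)

# Toy graph: two 4-node cliques joined by a single bridge edge (3, 4).
adj = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                adj[i, j] = 1.0
adj[3, 4] = adj[4, 3] = 1.0

labels = spectral_bipartition(adj)
print(labels, modularity(adj, labels))
```

On the toy graph the split should fall along the bridge edge, although which clique receives label 0 or 1 depends on the arbitrary sign of the eigenvector. A real analysis would need more than two clusters and a principled way to choose their number, which this sketch does not attempt.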
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ruan, J., Zhang, W. (2007). Identification and Evaluation of Functional Modules in Gene Co-expression Networks. In: Ideker, T., Bafna, V. (eds) Systems Biology and Computational Proteomics. RSB RCP 2006. Lecture Notes in Computer Science, vol 4532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73060-6_5
DOI: https://doi.org/10.1007/978-3-540-73060-6_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73059-0
Online ISBN: 978-3-540-73060-6
eBook Packages: Computer Science (R0)