Abstract
Association analysis of gene expression traits with genomic features is crucial for identifying the molecular mechanisms underlying cancer. In this study, we employ two sparse regression methods, Lasso and GFLasso, to discover genomic associations. Lasso penalizes a least squares regression by the sum of the absolute values of the coefficients, which leads to sparse solutions. GFLasso, an extension of Lasso, fuses regression coefficients across correlated outcome variables, which makes it especially suitable when the output traits are gene expression traits with an inherent network structure. We consider the combined benefits of these two methods and investigate the genomic associations they identify. Real genomic datasets from breast cancer and ovarian cancer patients are analyzed with the proposed approach. We show that combining the two methods has a significant impact on identifying crucial cancer-causing genomic features with both weaker and stronger associations.
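To make the two penalties in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the Lasso objective (1/2)·||y − Xβ||² + λ·||β||₁ solved by coordinate descent, and a GFLasso-style fusion term that sums |r|·|β_jm − sign(r)·β_jl| over edges (m, l) of a trait network weighted by a correlation r. The function names, the toy data, and the choice of |r| as the edge weight are illustrative assumptions.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows (samples x features); y is a list of responses.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta


def gflasso_fusion_penalty(B, edges, gamma):
    """Fusion term added on top of the Lasso penalty (illustrative form).

    B is features x traits; edges is a list of (m, l, r) where m and l index
    traits and r is their correlation. Coefficients of positively correlated
    traits are pulled together, negatively correlated ones toward opposite signs.
    """
    total = 0.0
    for m, l, r in edges:
        sign = 1.0 if r >= 0 else -1.0
        total += abs(r) * sum(abs(row[m] - sign * row[l]) for row in B)
    return gamma * total


# Toy data (hypothetical): feature 0 has a strong effect, feature 1 a weak one.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_coordinate_descent(X, y, lam=0.5)  # weak coefficient is set to 0
```

In this toy example the weak coefficient is driven exactly to zero, which is the sparsity the abstract refers to. The fusion term is what lets a GFLasso-style model retain weaker associations when they are shared across correlated traits.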
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vangimalla, R.R., Sohn, KA. (2015). Discovering genomic associations on cancer datasets by applying sparse regression methods. In: Kim, K. (eds) Information Science and Applications. Lecture Notes in Electrical Engineering, vol 339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46578-3_84
DOI: https://doi.org/10.1007/978-3-662-46578-3_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46577-6
Online ISBN: 978-3-662-46578-3
eBook Packages: Engineering, Engineering (R0)