Abstract
High-throughput technologies such as microarrays are a valuable resource for studying and understanding biological systems at low cost. However, noise in the data makes them less reliable, and many computational methods and algorithms have therefore been developed to remove it. We propose a novel noise removal algorithm based on Fourier transform functions. The algorithm optimizes the coefficients of the first and second order Fourier functions and selects the function that maximizes the Spearman correlation with the original data. To demonstrate the performance of this algorithm, we compare the prediction accuracy of well-known modelling tools, such as network component analysis (NCA), principal component analysis (PCA) and k-means clustering, on the original noisy data and on the data treated with the algorithm. The comparison uses three independent real biological data sets, each with two replicates. In all cases, the proposed algorithm removes the noise in the data and substantially improves the predictions of the modelling tools.
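The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the coefficients are optimized, so the sketch assumes a fixed fundamental frequency (one period over the sampled time span) and an ordinary linear least-squares fit. All function names (`fourier_design`, `fit_fourier`, `spearman`, `denoise`) are hypothetical.

```python
import numpy as np

def fourier_design(t, order, w):
    """Design matrix [1, cos(kwt), sin(kwt)] for k = 1..order."""
    cols = [np.ones_like(t)]
    for k in range(1, order + 1):
        cols.append(np.cos(k * w * t))
        cols.append(np.sin(k * w * t))
    return np.column_stack(cols)

def fit_fourier(t, y, order, w):
    """Least-squares fit of a Fourier function; returns the fitted values."""
    A = fourier_design(t, order, w)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

def _ranks(x):
    """Ranks starting at 1, with tied values given their average rank."""
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):
        mask = x == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    ra, rb = _ranks(a), _ranks(b)
    if np.std(ra) == 0 or np.std(rb) == 0:
        return -1.0  # constant profile: never preferred (assumption)
    return float(np.corrcoef(ra, rb)[0, 1])

def denoise(t, y):
    """Fit order-1 and order-2 functions; keep the one with higher Spearman."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 2 * np.pi / (t.max() - t.min())  # assumed fixed fundamental frequency
    fits = {order: fit_fourier(t, y, order, w) for order in (1, 2)}
    best = max(fits, key=lambda order: spearman(fits[order], y))
    return best, fits[best]

# Synthetic single-gene profile (illustrative data only, not from the paper).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 12)
noisy = 1.0 + 2.0 * np.sin(2 * np.pi * t / 10.0) + rng.normal(0.0, 0.3, t.size)
order, smoothed = denoise(t, noisy)
```

A profile needs more time points than fitted coefficients (three for first order, five for second order); otherwise the fit interpolates the noisy data exactly and the Spearman criterion cannot discriminate between the two candidates.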
An Erratum for this chapter can be found at http://dx.doi.org/10.1007/978-3-642-40669-0_46
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Doni Jayavelu, N., Bar, N. (2013). A Noise Removal Algorithm for Time Series Microarray Data. In: Correia, L., Reis, L.P., Cascalho, J. (eds) Progress in Artificial Intelligence. EPIA 2013. Lecture Notes in Computer Science, vol 8154. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40669-0_14
DOI: https://doi.org/10.1007/978-3-642-40669-0_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40668-3
Online ISBN: 978-3-642-40669-0
eBook Packages: Computer Science (R0)