Abstract
Classification of prokaryotes is mainly based on molecular data, since next-generation sequencing platforms provide a fast and effective way to capture the characteristics of prokaryotes. However, two bacterial strains of the same genus can differ in specific parts of their genomes because of copious repetitive and transposable elements. Thus, finding an ideal genome segment for comparison is difficult. Conventional character-based methods rely on multiple sequence alignment, which makes them extremely computationally demanding, so only small parts of genomes can be compared in reasonable time. In this paper, we present a novel algorithm based on the conversion of whole genome sequences to cumulative phase signals. The dyadic wavelet transform (DWT) is used for lossy compression of the phase signals by eliminating redundant frequency bands. Signal classification is then performed as cluster analysis using the Euclidean metric, with sequence alignment replaced by dynamic time warping (DTW).
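The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the nucleotide-to-phase mapping (quadrant angles of a complex nucleotide representation), the use of a plain Haar averaging filter as the dyadic low-pass stage, the absolute difference as the one-dimensional Euclidean local cost, and all function names (`cumulative_phase`, `haar_downsample`, `dtw_distance`).

```python
import math

# Assumed mapping of nucleotides to phase angles (illustrative, not from the paper).
PHASE = {"A": math.pi / 4, "G": 3 * math.pi / 4,
         "C": -3 * math.pi / 4, "T": -math.pi / 4}


def cumulative_phase(seq):
    """Return the running sum of per-nucleotide phases; unknown symbols add 0."""
    total, signal = 0.0, []
    for base in seq.upper():
        total += PHASE.get(base, 0.0)
        signal.append(total)
    return signal


def haar_downsample(signal, levels=1):
    """Keep only the low-pass band: average neighbouring pairs, halving the
    length at each level. A trailing unpaired sample is dropped."""
    for _ in range(levels):
        if len(signal) < 2:
            break
        signal = [(signal[i] + signal[i + 1]) / 2.0
                  for i in range(0, len(signal) - 1, 2)]
    return signal


def dtw_distance(a, b):
    """Classic dynamic time warping with absolute-difference local cost.
    Runs in O(len(a) * len(b)) time and O(len(b)) memory."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def distance_matrix(sequences, levels=1):
    """Pairwise DTW distances between downsampled cumulative phase signals."""
    signals = [haar_downsample(cumulative_phase(s), levels) for s in sequences]
    return [[dtw_distance(p, q) for q in signals] for p in signals]


if __name__ == "__main__":
    # Toy sequences standing in for whole genomes.
    toy = ["ACGTACGTACGT", "ACGTTACGTACGT", "GGGGCCCCGGGG"]
    for row in distance_matrix(toy):
        print(["%.2f" % d for d in row])
```

The resulting symmetric distance matrix could then be passed to any hierarchical clustering routine. Each decomposition level halves the signal length, so the quadratic cost of DTW falls by roughly a factor of four per level, which is the motivation for downsampling before comparison.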
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Sedlar, K., Skutkova, H., Vitek, M., Provaznik, I. (2014). Prokaryotic DNA Signal Downsampling for Fast Whole Genome Comparison. In: Piętka, E., Kawa, J., Wieclawek, W. (eds) Information Technologies in Biomedicine, Volume 3. Advances in Intelligent Systems and Computing, vol 283. Springer, Cham. https://doi.org/10.1007/978-3-319-06593-9_33
DOI: https://doi.org/10.1007/978-3-319-06593-9_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06592-2
Online ISBN: 978-3-319-06593-9
eBook Packages: Engineering, Engineering (R0)