Abstract
EasyCluster is a well-established Python tool developed to build reliable clusters of expressed sequence tags (ESTs) in order to infer and improve gene structures and to discover potential alternative splicing events. Here we present EasyCluster2, a reimplementation of EasyCluster in the Java programming language that can handle genome-scale transcriptome data produced by Roche 454 sequencers. EasyCluster2 has been developed to speed up the creation of gene-oriented clusters and to facilitate downstream analyses such as the assembly of full-length transcripts. In addition, EasyCluster2 can use known annotations to refine the overall clustering procedure, embeds the AStalavista software to predict the impact of alternative splicing per cluster, and provides output files in formats that can be uploaded to the UCSC Genome Browser for easy browsing of results. Thanks to its user-friendly interface, EasyCluster2 simplifies the interpretation of findings for researchers with no specific bioinformatics skills. The EasyCluster2 executable is freely available at https://code.google.com/p/easycluster2/.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bevilacqua, V., Pietroleonardo, N., Giannino, E.I., Stroppa, F., Pesole, G., Picardi, E. (2013). Clustering and Assembling Large Transcriptome Datasets by EasyCluster2. In: Huang, DS., Gupta, P., Wang, L., Gromiha, M. (eds) Emerging Intelligent Computing Technology and Applications. ICIC 2013. Communications in Computer and Information Science, vol 375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39678-6_39
DOI: https://doi.org/10.1007/978-3-642-39678-6_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39677-9
Online ISBN: 978-3-642-39678-6
eBook Packages: Computer Science, Computer Science (R0)