A Parallel Expressed Sequence Tag (EST) Clustering Program

  • Conference paper
Parallel Computing Technologies (PaCT 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2127)

Abstract

This paper describes the UIcluster software tool, which partitions Expressed Sequence Tag (EST) sequences and other genetic sequences into “clusters” based on sequence similarity. Ideally, each cluster contains only sequences that represent the same gene. If a naïve approach such as an N×N comparison is taken (where N is the number of input sequences), the problem is feasible only for very small data sets. UIcluster has been developed over the course of four years to solve this problem efficiently and accurately for large data sets consisting of tens or hundreds of thousands of EST sequences. The latest version of the application has been parallelized using the MPI (Message Passing Interface) standard. Both the computation and the memory requirements of the program can be distributed among multiple (possibly distributed) UNIX processes.
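
To make the cost of the naïve N×N approach concrete, the following minimal Python sketch clusters sequences by comparing every pair. It is an illustration only and is not UIcluster's algorithm, which the abstract does not describe. The similarity measure (fraction of shared k-mers), the threshold, and the names `pair_count`, `shared_kmer_fraction`, and `naive_cluster` are all assumptions chosen for the example.

```python
from itertools import combinations


def pair_count(n):
    """Number of distinct pairs an all-against-all comparison must examine."""
    return n * (n - 1) // 2


def shared_kmer_fraction(a, b, k=8):
    """Toy similarity (an assumption, not UIcluster's measure): the share of
    distinct k-mers of the k-mer-poorer sequence that also occur in the other."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    if not kmers_a or not kmers_b:
        return 0.0
    return len(kmers_a & kmers_b) / min(len(kmers_a), len(kmers_b))


def naive_cluster(seqs, threshold=0.5, k=8):
    """Single-linkage clustering by comparing every pair of sequences.

    Returns (clusters, comparisons), where clusters is a sorted list of
    lists of sequence indices and comparisons is the number of pairs tested.
    """
    parent = list(range(len(seqs)))  # union-find forest over sequence indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    comparisons = 0
    for i, j in combinations(range(len(seqs)), 2):
        comparisons += 1
        if shared_kmer_fraction(seqs[i], seqs[j], k) >= threshold:
            parent[find(j)] = find(i)  # merge the two clusters

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values()), comparisons


if __name__ == "__main__":
    toy = ["ACGTACGTGGCCTTAA", "CGTACGTGGCCTTAAG", "TTTTGGGGCCCCAAAA"]
    print(naive_cluster(toy))
    print(pair_count(100_000))
```

The pair count grows quadratically: under this scheme, 100,000 input sequences would already require about 5 billion pairwise comparisons, which is why the all-against-all approach is practical only for small inputs.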

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pedretti, K., Scheetz, T., Braun, T., Roberts, C., Robinson, N., Casavant, T. (2001). A Parallel Expressed Sequence Tag (EST) Clustering Program. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2001. Lecture Notes in Computer Science, vol 2127. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44743-1_51

  • DOI: https://doi.org/10.1007/3-540-44743-1_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42522-9

  • Online ISBN: 978-3-540-44743-6

  • eBook Packages: Springer Book Archive
