Abstract.
This paper proposes a parallelization of the Modified PrefixSpan method, which extracts frequent patterns from a sequence database. The system developed by the authors runs on multiple computers connected over a local area network. It provides a dynamic load balancing mechanism, implemented through communication among the computers using sockets and an MPI library. It also uses multiple threads to handle communication between a master process and multiple slave processes. The master process controls both the global job pool, which manages the set of subtrees generated in the initial processing, and the slave processes. In trial implementation experiments, 8 computers ran approximately 6 times faster than 1 computer.
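The job-pool idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain PrefixSpan-style prefix growth over single-character items rather than the Modified PrefixSpan method, and it replaces the socket and MPI communication between computers with threads and an in-process queue. All names here (`parallel_mine`, `mine_subtree`, `project`, `frequent_items`) are hypothetical.

```python
import queue
import threading
from collections import Counter


def project(db, item):
    """Keep the suffix after the first occurrence of item in each sequence."""
    out = []
    for seq in db:
        i = seq.find(item)
        if i != -1:
            out.append(seq[i + 1:])  # may be empty; it still counts as support
    return out


def frequent_items(db, min_support):
    """Items that occur in at least min_support sequences of db."""
    counts = Counter()
    for seq in db:
        counts.update(set(seq))
    return sorted(c for c, n in counts.items() if n >= min_support)


def mine_subtree(prefix, projected, min_support, found):
    """Depth-first growth of the subtree rooted at one prefix."""
    found[prefix] = len(projected)  # support of the prefix
    for item in frequent_items(projected, min_support):
        mine_subtree(prefix + item, project(projected, item), min_support, found)


def parallel_mine(db, min_support, n_workers=4):
    # Initial processing: one job (subtree) per frequent length-1 prefix.
    jobs = queue.Queue()  # stands in for the master's global job pool
    for item in frequent_items(db, min_support):
        jobs.put((item, project(db, item)))

    results = {}
    lock = threading.Lock()

    def worker():
        # Each worker pulls a new subtree as soon as it finishes the last one,
        # so uneven subtree sizes are balanced dynamically.
        while True:
            try:
                prefix, projected = jobs.get_nowait()
            except queue.Empty:
                return
            local = {}
            mine_subtree(prefix, projected, min_support, local)
            with lock:
                results.update(local)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `parallel_mine(["abc", "acb", "ab", "bc"], 2)` returns each frequent subsequence with its support count. Because workers take jobs on demand instead of receiving a fixed share up front, a worker that draws a small subtree simply returns for another one. For reference, the reported speedup of about 6 on 8 computers corresponds to a parallel efficiency of roughly 6/8 = 75%.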
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sutou, T., Tamura, K., Mori, Y., Kitakami, H. (2003). Design and Implementation of Parallel Modified PrefixSpan Method. In: Veidenbaum, A., Joe, K., Amano, H., Aiso, H. (eds) High Performance Computing. ISHPC 2003. Lecture Notes in Computer Science, vol 2858. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39707-6_36
DOI: https://doi.org/10.1007/978-3-540-39707-6_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20359-9
Online ISBN: 978-3-540-39707-6
eBook Packages: Springer Book Archive