A Fast Algorithm to Find All the Maximal Frequent Sequences in a Text

García-Hernández, René A.; Martínez-Trinidad, José Fco.; Carrasco-Ochoa, Jesús Ariel

doi:10.1007/978-3-540-30463-0_60

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3287)

Included in the following conference series:

Iberoamerican Congress on Pattern Recognition

Abstract

One problem in sequential pattern mining is finding the maximal frequent sequences in a database for a given support threshold β. In this paper, we propose a new algorithm that finds all the maximal frequent sequences in a text rather than in a database. Unlike typical sequential pattern mining algorithms, our algorithm avoids the joining, pruning, and text-scanning steps. Experiments show that all the maximal frequent sequences of a medium-sized text can be obtained in a few seconds.
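
To make the problem statement concrete, here is a minimal Python sketch of a naive baseline. It is not the algorithm proposed in the paper, which the abstract does not describe. It assumes that a "sequence" is a run of consecutive words, that frequency is the number of occurrences in the text, and that a sequence is frequent when it occurs at least β times. The function name `maximal_frequent_sequences` and the whitespace tokenization are hypothetical choices made only for illustration.

```python
from collections import Counter


def maximal_frequent_sequences(text, beta):
    """Naive baseline (not the paper's method): return the contiguous word
    sequences that occur at least `beta` times and are not contained in
    any longer frequent sequence."""
    words = text.lower().split()  # assumed tokenization: lowercase, whitespace
    frequent = set()
    n = 1
    while True:
        # Count every contiguous n-word sequence in the text.
        counts = Counter(
            tuple(words[i:i + n]) for i in range(len(words) - n + 1)
        )
        level = {gram for gram, c in counts.items() if c >= beta}
        if not level:
            # No frequent sequence of length n, so none of length n + 1 either.
            break
        frequent |= level
        n += 1

    def contains(longer, shorter):
        # True if `shorter` appears as a contiguous run inside `longer`.
        k = len(shorter)
        return any(
            longer[i:i + k] == shorter for i in range(len(longer) - k + 1)
        )

    # Keep only sequences that no longer frequent sequence contains.
    maximal = [
        s for s in frequent
        if not any(len(t) > len(s) and contains(t, s) for t in frequent)
    ]
    return sorted(maximal, key=lambda s: (-len(s), s))


if __name__ == "__main__":
    sample = "the cat sat on the mat the cat sat on the rug"
    for seq in maximal_frequent_sequences(sample, beta=2):
        print(" ".join(seq))
```

This baseline rescans the text once per sequence length and filters candidates afterwards, which is the kind of repeated scanning and pruning the abstract says the proposed algorithm avoids.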

Author information

Authors and Affiliations

National Institute of Astrophysics, Optics and Electronics (INAOE), Puebla, México
René A. García-Hernández, José Fco. Martínez-Trinidad & Jesús Ariel Carrasco-Ochoa

Editor information

Editors and Affiliations

Dept. System Engineering and Automation, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
Alberto Sanfeliu
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840, Sta. Maria Tonantzintla, Puebla, Mexico
José Francisco Martínez Trinidad
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840, Sta. Maria Tonantzintla, Puebla, Mexico
Jesús Ariel Carrasco Ochoa

About this paper

Cite this paper

García-Hernández, R.A., Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A. (2004). A Fast Algorithm to Find All the Maximal Frequent Sequences in a Text. In: Sanfeliu, A., Martínez Trinidad, J.F., Carrasco Ochoa, J.A. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2004. Lecture Notes in Computer Science, vol 3287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30463-0_60

DOI: https://doi.org/10.1007/978-3-540-30463-0_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23527-9
Online ISBN: 978-3-540-30463-0
eBook Packages: Springer Book Archive
