Abstract
Mining sequential rules requires specifying two parameters that are often difficult to set: the minimum support and the minimum confidence. Depending on the values chosen, current algorithms can become very slow and generate an extremely large number of results, or generate too few results and omit valuable information. This is a serious problem because, in practice, users have limited resources for analyzing results and are often interested in discovering only a certain number of them, while fine-tuning the parameters can be very time-consuming. In this paper, we address this problem by proposing TopSeqRules, an efficient algorithm for mining the top-k sequential rules from sequence databases, where k is the number of sequential rules to be found and is set by the user. Experimental results on real-life datasets show that the algorithm has excellent performance and scalability.
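To make the idea of replacing a support threshold with k concrete, the following is a minimal brute-force sketch; it is not the TopSeqRules algorithm, whose details are not given in this abstract. It assumes, for illustration only, that rules have a single-item antecedent and consequent (a ⇒ b holds in a sequence if some occurrence of a precedes some occurrence of b), that rules are ranked by support, and that an optional minimum confidence still applies. The function names `rule_counts` and `top_k_rules` are hypothetical.

```python
import heapq
from itertools import permutations


def rule_counts(sequences):
    """Count sequences containing each item, and sequences where a occurs before b."""
    pair_count = {}
    item_count = {}
    for seq in sequences:
        first, last = {}, {}
        for pos, item in enumerate(seq):
            first.setdefault(item, pos)  # earliest position of each item
            last[item] = pos             # latest position of each item
        for a in first:
            item_count[a] = item_count.get(a, 0) + 1
        for a, b in permutations(first, 2):
            # a => b holds here if some a appears before some b
            if first[a] < last[b]:
                pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    return pair_count, item_count


def top_k_rules(sequences, k, min_conf=0.0):
    """Return up to k rules as (support, confidence, (a, b)), best first.

    Assumption for this sketch: ranking is by support, ties broken by confidence.
    """
    if k <= 0 or not sequences:
        return []
    pair_count, item_count = rule_counts(sequences)
    n = len(sequences)
    heap = []  # min-heap holding the k best rules seen so far
    for (a, b), count in pair_count.items():
        support = count / n
        confidence = count / item_count[a]
        if confidence < min_conf:
            continue
        entry = (support, confidence, (a, b))
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # Better than the weakest kept rule: replace it
            heapq.heapreplace(heap, entry)
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["a", "b"]]
    for support, confidence, (a, b) in top_k_rules(db, k=2, min_conf=0.5):
        print(f"{a} => {b}  support={support:.2f}  confidence={confidence:.2f}")
```

In this toy database the rule a ⇒ c ranks first, because a precedes c in three of the four sequences. The user supplies only k; the weakest rule kept in the heap acts as an implicit support threshold that rises as better rules are found.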
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fournier-Viger, P., Tseng, V.S. (2011). Mining Top-K Sequential Rules. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7121. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25856-5_14
DOI: https://doi.org/10.1007/978-3-642-25856-5_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25855-8
Online ISBN: 978-3-642-25856-5
eBook Packages: Computer Science, Computer Science (R0)