Abstract
Assessing rules with interestingness measures is central to the successful application of association rule discovery. However, the number of discovered association rules is typically large, and many of them are neither interesting nor significant for the application at hand. In this paper, we present a systematic approach for ascertaining the discovered rules and provide a precise statistical approach to support this framework. Furthermore, because many interestingness measures exist, we propose and compare two established approaches for selecting relevant attributes prior to rule generation. The proposed strategy combines data mining and statistical measurement techniques, including redundancy analysis, sampling, and multivariate statistical analysis, to discard non-significant rules. In addition, we consider real-world datasets characterized by uniform and non-uniform data/item distributions, with a mixture of measurement levels across the data/items. The proposed unified framework is applied to these datasets to demonstrate its effectiveness in discarding many redundant or non-significant rules while preserving the high accuracy of the rule set as a whole.
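As a rough illustration of what discarding non-significant rules can mean in practice, the following minimal sketch scores a single rule A -> C by support, confidence, and a Pearson chi-square statistic on its 2x2 contingency table. It is not the framework described in this paper. The chi-square test is swapped in here as a common, simple significance check, and the function names `rule_stats` and `is_significant`, the 0.05 significance level, and the toy transactions are all assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple

# Critical value of the chi-square distribution with 1 degree of freedom
# at the 0.05 significance level (assumed threshold for this sketch).
CHI2_CRITICAL_1DF_05 = 3.841


def rule_stats(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[float, float, float]:
    """Return (support, confidence, chi-square) for the rule A -> C."""
    n = len(transactions)
    a, c = frozenset(antecedent), frozenset(consequent)

    # Count transactions containing A, containing C, and containing both.
    n_a = sum(1 for t in transactions if a <= t)
    n_c = sum(1 for t in transactions if c <= t)
    n_ac = sum(1 for t in transactions if a <= t and c <= t)

    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0

    # 2x2 contingency table: rows = A present/absent, cols = C present/absent.
    observed = [
        [n_ac, n_a - n_ac],
        [n_c - n_ac, n - n_a - n_c + n_ac],
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_c, n - n_c]

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed[i][j] - expected) ** 2 / expected
    return support, confidence, chi2


def is_significant(chi2: float, critical: float = CHI2_CRITICAL_1DF_05) -> bool:
    """Keep a rule only if its chi-square statistic exceeds the critical value."""
    return chi2 > critical


# Toy data (hypothetical): "a" and "b" always occur together.
associated = [{"a", "b"}] * 4 + [set()] * 4
# Toy data (hypothetical): "a" and "b" occur independently.
independent = [{"a", "b"}, {"a"}, {"b"}, set()]

for name, data in [("associated", associated), ("independent", independent)]:
    s, conf, chi2 = rule_stats(data, {"a"}, {"b"})
    print(name, s, conf, chi2, is_significant(chi2))
```

On such small samples the chi-square approximation is unreliable, so the toy data only shows the mechanics: a rule whose antecedent and consequent are independent gets a statistic of zero and would be discarded, however high its support.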
Copyright information
© 2015 IFIP International Federation for Information Processing
About this paper
Cite this paper
Shaharanee, I.N.M., Jamil, J.M. (2015). A Framework for Interestingness Measures for Association Rules with Discrete and Continuous Attributes Based on Statistical Validity. In: Dillon, T. (eds) Artificial Intelligence in Theory and Practice IV. IFIP AI 2015. IFIP Advances in Information and Communication Technology, vol 465. Springer, Cham. https://doi.org/10.1007/978-3-319-25261-2_11
DOI: https://doi.org/10.1007/978-3-319-25261-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25260-5
Online ISBN: 978-3-319-25261-2
eBook Packages: Computer Science (R0)