Abstract
We propose an interestingness measure, based on the Minimum Description Length Principle, for groups of mutually related classification rules. Unlike conventional methods, our measure has a theoretical foundation, has no parameters, is applicable to a group of any number of rules, and can exploit an initial hypothesis. We have integrated the interestingness measure with a practical heuristic search and built a rule-group discovery method, CLARDEM (Classification Rule Discovery method based on an Extended-Mdlp). Extensive experiments using both real and artificial data confirm that CLARDEM can discover the correct concept from a small noisy data set and an approximate initial concept with high “discovery accuracy”.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Suzuki, E. (2009). Negative Encoding Length as a Subjective Interestingness Measure for Groups of Rules. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, TB. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2009. Lecture Notes in Computer Science, vol 5476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01307-2_22
DOI: https://doi.org/10.1007/978-3-642-01307-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01306-5
Online ISBN: 978-3-642-01307-2
eBook Packages: Computer Science, Computer Science (R0)