Abstract
Previous research has produced a number of different algorithms for rule discovery. Two approaches discussed here, the ‘all-rules’ algorithm and multi-objective metaheuristics, both produce a large number of partial classification rules, or ‘nuggets’, that describe different subsets of the records in the class of interest. This paper describes the application of a number of different clustering algorithms to these rules, in order to identify similar rules and to better understand the data.
Cite this article
Reynolds, A.P., Richards, G., de la Iglesia, B. et al. Clustering Rules: A Comparison of Partitioning and Hierarchical Clustering Algorithms. J Math Model Algor 5, 475–504 (2006). https://doi.org/10.1007/s10852-005-9022-1
DOI: https://doi.org/10.1007/s10852-005-9022-1