Abstract
Lists of ordered objects are widely used as representational forms; examples include Web search results and best-seller lists. Clustering is a useful data analysis technique for grouping mutually similar objects. To cluster orders, hierarchical clustering methods have been used together with dissimilarities defined between pairs of orders. However, hierarchical clustering methods cannot be applied to large-scale data because of their computational cost with respect to the number of orders. To avoid this problem, we developed a k-o’means algorithm. This algorithm successfully extracted grouping structures in orders and was computationally efficient with respect to the number of orders. However, it was still inefficient when the number of possible objects was large. We therefore propose a new method (k-o’means-EBC), grounded in a theory of order statistics. We further propose several techniques for analyzing acquired clusters of orders.
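As an illustration of what a dissimilarity between a pair of orders can look like, the following is a minimal sketch and not the chapter's own method. It computes a normalized Kendall-style distance, that is, the fraction of object pairs that two orders rank in opposite ways. The function names `ranks` and `kendall_dissimilarity` are hypothetical, and the sketch assumes both orders are complete rankings of the same objects without duplicates; orders over differing subsets of objects are not handled here.

```python
from itertools import combinations


def ranks(order):
    """Map each object to its 1-based position in the order."""
    return {obj: pos for pos, obj in enumerate(order, start=1)}


def kendall_dissimilarity(order_a, order_b):
    """Fraction of object pairs that the two orders rank in opposite ways.

    Assumption: both orders rank exactly the same set of objects.
    Returns 0.0 for identical orders and 1.0 for fully reversed ones.
    """
    ra, rb = ranks(order_a), ranks(order_b)
    if ra.keys() != rb.keys():
        raise ValueError("orders must contain the same objects")
    pairs = list(combinations(ra, 2))
    if not pairs:
        return 0.0
    # A pair is discordant when the two orders disagree on which comes first.
    discordant = sum(
        1 for x, y in pairs if (ra[x] - ra[y]) * (rb[x] - rb[y]) < 0
    )
    return discordant / len(pairs)
```

Comparing every pair of orders this way gives the dissimilarity matrix that a hierarchical method would need, which is one way to see why the cost grows with the number of orders.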
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kamishima, T., Akaho, S. (2009). Efficient Clustering for Orders. In: Zighed, D.A., Tsumoto, S., Ras, Z.W., Hacid, H. (eds) Mining Complex Data. Studies in Computational Intelligence, vol 165. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88067-7_15
DOI: https://doi.org/10.1007/978-3-540-88067-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88066-0
Online ISBN: 978-3-540-88067-7
eBook Packages: Engineering (R0)