Abstract
A critical problem related to kernel-based methods is how to select optimal kernels. A kernel function must conform to the learning target in order to obtain meaningful results. While solutions to the problem of estimating optimal kernel functions and corresponding parameters have been proposed in a supervised setting, it remains a challenge when no labeled data are available, and all we have is a set of pairwise must-link and cannot-link constraints. In this paper, we address the problem of optimizing the kernel function using pairwise constraints for semi-supervised clustering. We propose a new optimization criterion for automatically estimating the optimal parameters of composite Gaussian kernels, directly from the data and given constraints. We combine our proposal with a semi-supervised kernel-based algorithm to demonstrate experimentally the effectiveness of our approach. The results show that our method is very effective for kernel-based semi-supervised clustering.
Article PDF
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
References
Amini MR, Gallinari P (2005) Semi-supervised learning with an imperfect supervisor. Knowl Inf Syst 8: 385–413
Bach F, Lanckriet GRG, Jordan MI (2004) Multiple kernel learning, conic duality, and the SMO algorithm. In: International conference on machine learning, pp 41–48
Bar-Hillel A, Hertz T, Shental N, Weinshall D (2003) Learning distance functions using equivalence relations. In: International conference on machine learning, pp 11–18
Basu S, Bilenko M, Mooney RJ (2004) A probabilistic framework for semi-supervised clustering. In: International conference on knowledge discovery and data mining, pp 59–68
Blake CL, Merz CJ (1998) UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html
Blaszczyk J, Karbowski A, Malinowski K (2007) Object library of algorithms for dynamic optimization problems: benchmaring SQP and nonlinear interior point methods. Int J Appl Math Comput Sci 17(4): 515–537
Chapelle O, Vapnik V (2002) Choosing mutiple parameters for support vector machines. Machine Learn 46(1): 131–159
Chen Y, Rege M, Dong M, Hua J (2008) Non-negative matrix factorization for semi-supervised data clustering. Knowl Inf Syst 17: 355–379
Cohn D, Caruana R, McCallum A (2003) Semi-supervised clustering with user feedback. TR2003-1892, Cornell University
Cristianini N, Shawe-Taylor J, Elisseeff A (2001) On kernel-target alignment, neural information processing systems 14. MIT Press, Cambridge, pp 367–373
Davis J, Kulis B, Jain P, Sra S, Dhillon IS (2007) Information-theoretic metric learning. In: International conference on machine learning, pp 209–216
Han SP (2007) A globally convergent method for nonlinear programming. J Optim Theory Appl 22(3): 297–309
Hastie T, Tibshirani R, Friedman JH (2001) The elements of statistical learning. Springer, Berlin
Huang J, Yuen PC, Chen WS, Lai JH (2004) Kernel subspace LDA with optimized kernel parameters on face recognition. In: The sixth IEEE international conference on automatic face and gesture recognition, pp 327–332
Kondor R, Jebara T (2007) Gaussian and Wishart hyperkernels. In: Advances in neural information processing systems vol 19. MIT Press, Cambridge, pp 729–736
Kulis B, Basu S, Dhillon I, Moony R (2005) Semi-supervised graph clustering: a kernel approach. In: International conference on machine learning, pp 457–464
Lanckriet GRG, Cristianini N, Bartlett P, El Ghaoui L, Jordan MI (2004) Learning the kernel matrix with semidefinite programming. J Machine Learn Res 5: 27–72
Lanckriet GRG, De Bie T, Cristianini N, Jordan MI, Noble WS (2004) A statistical framework for genomic data fusion. Bioinformatics 20(16): 2626–2635
Li T, Ogihara M (2004) Semisupervised learning from different information sources. Knowl Inf Syst 7: 289–309
Ong CS, Smola AJ, Williamson RC (2003) Hyperkernels. In: Advances in neural information processing systems vol 15. MIT Press, Cambridge, pp 478–485
Powell MJD (1978) A fast algorithm for nonlinearly constrained optimization calculations. In: Watson GA (ed) Numerical analysis. Lecture notes in mathematics, Springer, Berlin 630:144–157
Powell MJD (1978) The convergence of variable metric methods for nonlinearly constrained optimization calculations. In: Mangasarian OL, Meyer RR, Robinson SM (eds) Nonlinear programming 3. Academic Press, New York
Shawe-Tayor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambrige University, Cambrige
Strehl A, Ghosh J (2002) Cluster ensembles—a knowledge reuse framework for combining multiple partitions. J Machine Learn Res 3(3): 583–617
Theodoridis S, Koutroubas K (1999) Pattern recognition. Academic Press, New York
Vapnik V (1995) The nature of statistical learning theory. Wiley, New York
Wagstaff K, Cardie C, Rogers S, Schroedl S (2001) Constrained K-Means clustering with background knowledge. In: International conference on machine learning, pp 577–584
Wang W, Xu Z, Lu W, Zhang X (2002) Determination of the spread parameter in the Gaussian kernel for classification and regression. Neurocomputing 55(3–4): 643–663
Xing EP, Ng AY, Jordan MI, Russell S (2003) Distance metric learning, with application to clustering with side-information. In: Advances in neural information processing systems vol 15. MIT Press, Cambridge, pp 505–512
Yan B, Domeniconi C (2006) An adaptive kernel method for semi-supervised clustering. In: European conference on machine learning, pp 521–532
Ye J, Chen K, Wu T, Li J, Zhao Z, Patel R, Bae M, Janardan R, Liu H, Alexander G, Reiman E (2008) Heterogeneous data fusion for Alzheimer’s disease study. In: The 4th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1025–1033
Ye J, Ji S, Chen J (2008) Multi-class discriminant kernel learning via convex programming. J Machine Learn Res 9: 719–758
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Domeniconi, C., Peng, J. & Yan, B. Composite kernels for semi-supervised clustering. Knowl Inf Syst 28, 99–116 (2011). https://doi.org/10.1007/s10115-010-0318-8
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10115-010-0318-8