Abstract
In this paper, we address the problem of learning nonlinearly separable concepts with a kernel classifier when the available data are corrupted by uniform classification noise. Our approach combines random (or deterministic) projections with a classification-noise-tolerant perceptron learning algorithm that assumes distributions defined over finite-dimensional spaces. Provided the problem exhibits a sufficient separation margin, this strategy makes it possible to learn from a noisy distribution in any separable Hilbert space, regardless of its dimension; learning with any appropriate Mercer kernel is therefore possible. We prove that the required sample complexity and running time of our algorithm are polynomial in the classical PAC learning parameters. Numerical simulations on toy datasets and on data from the UCI repository support the validity of our approach.
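The construction is only summarized above; as a minimal sketch (not the authors' algorithm), the following Python code illustrates the two ingredients it combines: a finite-dimensional projection built from kernel evaluations against randomly drawn landmark points (in the spirit of Balcan, Blum and Vempala's "kernels as features" mapping) and a perceptron run in the projected space. The RBF kernel, the toy concept, the noise rate, and the plain perceptron update (standing in for the paper's noise-tolerant variant) are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) Mercer kernel matrix between the rows of X and Y."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * sq)

def kernel_feature_map(X, landmarks, gamma=1.0):
    """Project x to (K(x, z_1), ..., K(x, z_d)) for d landmark points z_i,
    giving a finite-dimensional space in which to run a perceptron."""
    return rbf_kernel(X, landmarks, gamma)

def perceptron(Phi, y, epochs=50):
    """Plain perceptron in the projected space; the paper instead uses a
    classification-noise-tolerant update, which is not reproduced here."""
    w = np.zeros(Phi.shape[1])
    for _ in range(epochs):
        for phi, label in zip(Phi, y):
            if label * (w @ phi) <= 0:   # mistake-driven update
                w += label * phi
    return w

# Toy setup: a nonlinearly separable concept with uniform classification noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y_clean = np.sign(X[:, 0]**2 + X[:, 1]**2 - 0.5)   # circular concept
y = y_clean.copy()
flip = rng.random(len(y)) < 0.1                    # flip each label w.p. 0.1
y[flip] *= -1.0

d = 30                                             # projection dimension
landmarks = X[rng.choice(len(X), size=d, replace=False)]
Phi = kernel_feature_map(X, landmarks, gamma=2.0)
w = perceptron(Phi, y)
print("accuracy vs. clean labels:",
      np.mean(np.sign(Phi @ w) == y_clean))
```

The projection dimension d matters: with enough landmarks, a sufficiently large margin in the kernel-induced space is approximately preserved in the projected space with high probability, which is what lets a finite-dimensional, noise-tolerant perceptron stand in for learning directly in the Hilbert space.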
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stempfel, G., Ralaivola, L. (2007). Learning Kernel Perceptrons on Noisy Data Using Random Projections. In: Hutter, M., Servedio, R.A., Takimoto, E. (eds) Algorithmic Learning Theory. ALT 2007. Lecture Notes in Computer Science, vol 4754. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75225-7_27
DOI: https://doi.org/10.1007/978-3-540-75225-7_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75224-0
Online ISBN: 978-3-540-75225-7