Abstract
We propose a novel regularizer for training an auto-encoder for unsupervised feature extraction. We explicitly encourage the latent representation to contract the input space by regularizing the norm of the Jacobian (analytically) and the Hessian (stochastically) of the encoder's output with respect to its input, at the training points. While the penalty on the Jacobian's norm ensures robustness to small corruptions of the input, constraining the norm of the Hessian extends this robustness further away from the training samples. From a manifold learning perspective, balancing this regularization against the auto-encoder's reconstruction objective yields a representation that varies most when moving along the data manifold in input space and is least sensitive in directions orthogonal to the manifold. The second-order regularization, via the Hessian, penalizes curvature and thus favors smooth manifolds. We show that the proposed technique, while remaining computationally efficient, yields representations significantly better suited for initializing deep architectures than previously proposed approaches, beating state-of-the-art performance on a number of datasets.
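For concreteness, the combined objective described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the single-layer tied-weight sigmoid encoder/decoder, the hyperparameter names (lam, gamma, sigma) and the number of corrupted copies are assumptions, and jax.jacfwd stands in for the analytic Jacobian the paper computes in closed form for a sigmoid layer.

import jax
import jax.numpy as jnp

def encoder(params, x):
    W, b, _ = params
    return jax.nn.sigmoid(W @ x + b)

def decoder(params, h):
    W, _, c = params
    return jax.nn.sigmoid(W.T @ h + c)  # tied weights

def cae_h_loss(params, x, key, lam=0.1, gamma=0.1, sigma=0.3, n_corrupt=4):
    h = encoder(params, x)
    r = decoder(params, h)
    recon = jnp.sum((x - r) ** 2)                    # reconstruction error

    J = jax.jacfwd(lambda v: encoder(params, v))(x)  # analytic Jacobian penalty
    contract = jnp.sum(J ** 2)                       # ||J_f(x)||_F^2

    # Stochastic Hessian penalty: E_eps ||J_f(x + eps) - J_f(x)||_F^2 with
    # eps ~ N(0, sigma^2 I), estimated from a few corrupted copies of x.
    eps = sigma * jax.random.normal(key, (n_corrupt,) + x.shape)
    J_near = jax.vmap(
        lambda e: jax.jacfwd(lambda v: encoder(params, v))(x + e)
    )(eps)
    hessian_pen = jnp.mean(jnp.sum((J_near - J) ** 2, axis=(1, 2)))

    return recon + lam * contract + gamma * hessian_pen

The stochastic estimate avoids forming the Hessian of the encoder explicitly; as the corruption scale sigma shrinks, the finite difference of Jacobians at x and x + eps approximates a directional second derivative, so penalizing it penalizes curvature of the learned mapping.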
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Rifai, S. et al. (2011). Higher Order Contractive Auto-Encoder. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol. 6912. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23783-6_41
DOI: https://doi.org/10.1007/978-3-642-23783-6_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23782-9
Online ISBN: 978-3-642-23783-6
eBook Packages: Computer Science (R0)