Discriminative Mixture-of-Templates for Viewpoint Classification

Gu, Chunhui; Ren, Xiaofeng

doi:10.1007/978-3-642-15555-0_30


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6315)

Included in the following conference series:

European Conference on Computer Vision


Abstract

Object viewpoint classification aims at predicting an approximate 3D pose of objects in a scene and is receiving increasing attention. State-of-the-art approaches to viewpoint classification use generative models to capture relations between object parts. In this work we propose to use a mixture of holistic templates (e.g. HOG) and discriminative learning for joint viewpoint classification and category detection. Inspired by the work of Felzenszwalb et al. (2009), we discriminatively train multiple components simultaneously for each object category. A large number of components are learned in the mixture, and they are associated with canonical viewpoints of the object through different levels of supervision: fully supervised, semi-supervised, or unsupervised. We show that discriminative learning is capable of producing mixture components that directly provide robust viewpoint classification, significantly outperforming the state of the art: we improve the viewpoint accuracy on the Savarese et al. 3D Object database from 57% to 74%, and that on the VOC 2006 car database from 73% to 86%. In addition, the mixture-of-templates approach to object viewpoint/pose extends naturally to the continuous case by discriminatively learning a linear appearance model locally at each discrete view. We evaluate continuous viewpoint estimation on a dataset of everyday objects collected using IMUs for ground-truth annotation: our mixture model shows great promise compared to a number of baselines, including discrete nearest neighbor and linear regression.
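
The discrete case described in the abstract can be illustrated with a minimal sketch. It assumes each mixture component is a linear template scored against a holistic feature vector (such as a HOG descriptor), and that the best-scoring component supplies both the detection score and the viewpoint label. The function names, template weights, biases, and view labels below are hypothetical and chosen only for illustration; the paper's training procedure and the continuous extension are not shown.

```python
from typing import List, Sequence, Tuple


def score_components(templates: Sequence[Sequence[float]],
                     biases: Sequence[float],
                     feature: Sequence[float]) -> List[float]:
    """Return one linear score w_k . x + b_k per mixture component."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, feature)) + b
            for w, b in zip(templates, biases)]


def classify_viewpoint(templates: Sequence[Sequence[float]],
                       biases: Sequence[float],
                       feature: Sequence[float],
                       view_labels: Sequence[str]) -> Tuple[str, float]:
    """Pick the best-scoring component.

    Its label is taken as the predicted viewpoint, and its score is
    reused as the category detection score (an assumption of this sketch).
    """
    scores = score_components(templates, biases, feature)
    best = max(range(len(scores)), key=scores.__getitem__)
    return view_labels[best], scores[best]


# Toy example: three hypothetical components over a 4-D feature vector.
templates = [[1.0, 0.0, 0.0, 0.0],
             [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 1.0]]
biases = [0.0, -0.5, 0.0]
view_labels = ["front", "side", "rear"]
feature = [0.2, 0.9, 0.1, 0.1]

view, score = classify_viewpoint(templates, biases, feature, view_labels)
print(view, round(score, 2))
```

In this toy setup the second component wins, so the predicted view is "side". In practice the feature vector would come from a real descriptor and the templates from discriminative training.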


References

Koenderink, J., van Doorn, A.: The internal representation of solid shape with respect to vision. Biological Cybernetics 32, 211–216 (1979)
Lowe, D.: Distinctive image features from scale-invariant keypoints. Int’l. J. Comp. Vision 60, 91–110 (2004)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)
Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. TPAMI (2009)
Savarese, S., Fei-Fei, L.: 3d generic object categorization, localization and pose estimation. In: ICCV (2007)
Sun, M., Su, H., Savarese, S., Fei-Fei, L.: A multi-view probabilistic model for 3d object classes. In: CVPR, pp. 1247–1254 (2009)
Su, H., Sun, M., Fei-Fei, L., Savarese, S.: Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories. In: ICCV (2009)
Everingham, M., Zisserman, A., Williams, C.K.I., Van Gool, L.: The PASCAL Visual Object Classes Challenge 2006, VOC 2006 Results (2006), http://www.pascal-network.org/challenges/VOC/voc2006/results.pdf
Arie-Nachimson, M., Basri, R.: Constructing implicit 3d shape models for pose estimation. In: ICCV (2009)
Cyr, C., Kimia, B.: A similarity-based aspect-graph approach to 3d object recognition. Int’l. J. Comp. Vision 57, 5–22 (2004)
Hoiem, D., Rother, C., Winn, J.: 3d layoutcrf for multi-view object class recognition and segmentation. In: CVPR (2007)
Kushal, A., Schmid, C., Ponce, J.: Flexible object models for category-level 3d object recognition. In: CVPR (2004)
Chiu, H., Kaelbling, L., Lozano-Perez, T.: Virtual-training for multi-view object class recognition. In: CVPR (2007)
Rothganger, F., Lazebnik, S., Schmid, C., Ponce, J.: 3d object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints. Int’l. J. Comp. Vision 66, 231–259 (2006)
Berg, A., Berg, T., Malik, J.: Shape matching and object recognition using low distortion correspondence. In: CVPR, vol. 1, pp. 26–33 (2005)
Bulthoff, H., Edelman, S.: Psychophysical support for a two-dimensional view interpolation theory of object recognition. PNAS 89, 60–64 (1992)
DeMenthon, D., Davis, L.: Model-based object pose in 25 lines of code. Int’l. J. Comp. Vision 15, 123–141 (1995)
Lavallee, S., Szeliski, R.: Recovering the position and orientation of free-form objects from image contours using 3d distance maps. IEEE Trans. PAMI 17, 378–390 (1995)
Collet, A., Berenson, D., Srinivasa, S., Ferguson, D.: Object recognition and full pose registration from a single image for robotic manipulation. In: ICRA (2009)
Detry, R., Pugeault, N., Piater, J.: A probabilistic framework for 3D visual object representation. IEEE Trans. PAMI 31, 1790–1803 (2009)
Thomas, A., Ferrari, V., Leibe, B., Tuytelaars, T., Schiele, B., Gool, L.V.: Towards multi-view object class detection. In: CVPR (2006)
Liebelt, J., Schmid, C., Schertler, K.: Viewpoint-independent object class detection using 3d feature maps. In: CVPR (2008)
Vedaldi, A., Gulshan, V., Varma, M., Zisserman, A.: Multiple kernels for object detection. In: NIPS (2009)
Lampert, C.: Partitioning of image datasets using discriminative context information. In: CVPR, pp. 1–8 (2008)
Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines (2001), software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. PAMI 22, 888–905 (2000)
Aiolli, F., Sperduti, A.: Multiclass classification with multi-prototype support vector machines. Journal of Machine Learning Research (2005)
Manning, C.D., Raghavan, P., Schutze, H.: Introduction to Information Retrieval. Cambridge University Press, Cambridge (2008)
Rosenhahn, B., Brox, T., Weickert, J.: Three-dimensional shape knowledge for joint image segmentation and pose tracking. Int’l. J. Comp. Vision 73, 243–262 (2007)
Ozuysal, M., Lepetit, V., Fua, P.: Pose estimation for category specific multiview object localization. In: CVPR (2009)

Author information

Authors and Affiliations

University of California at Berkeley, Berkeley, CA, 94720, USA
Chunhui Gu
Intel Labs Seattle, 1100 NE 45th Street, Seattle, WA, 98105, USA
Xiaofeng Ren


Editor information

Editors and Affiliations

GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Kostas Daniilidis
National Technical University of Athens, School of Electrical and Computer Engineering, 15773, Athens, Greece
Petros Maragos
Department of Applied Mathematics, Ecole Centrale de Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Nikos Paragios


About this paper

Cite this paper

Gu, C., Ren, X. (2010). Discriminative Mixture-of-Templates for Viewpoint Classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15555-0_30


DOI: https://doi.org/10.1007/978-3-642-15555-0_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15554-3
Online ISBN: 978-3-642-15555-0
eBook Packages: Computer Science, Computer Science (R0)
