Sparselet Models for Efficient Multiclass Object Detection

Song, Hyun Oh; Zickler, Stefan; Althoff, Tim; Girshick, Ross; Fritz, Mario; Geyer, Christopher; Felzenszwalb, Pedro; Darrell, Trevor

doi:10.1007/978-3-642-33709-3_57

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7573)

Included in the following conference series: European Conference on Computer Vision

Abstract

We develop an intermediate representation for deformable part models (DPMs) and show that this representation has favorable performance characteristics for multi-class problems when the number of classes is high. Our model uses sparse coding of part filters to represent each filter as a sparse linear combination of shared dictionary elements. This leads to a universal set of parts that are shared among all object classes. Reconstruction of the original part filter responses via a sparse matrix-vector product reduces computation relative to conventional part filter convolutions. Our model is well suited to a parallel implementation, and we report a new GPU DPM implementation that takes advantage of sparse coding of part filters. The speed-up offered by our intermediate representation and parallel computation enables real-time DPM detection of 20 different object classes on a laptop computer.
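The reconstruction idea in the abstract rests on the linearity of filtering: if a part filter is a linear combination of dictionary elements, its response map is the same combination of the dictionary response maps. The following is a minimal NumPy sketch of that identity, not the authors' implementation. All names (`correlate_valid`, `demo`, `D`, `A`) and all sizes are hypothetical, the features are random stand-ins for HOG-like features, and the filters are constructed to be exactly sparse in the dictionary so that the reconstruction is exact. With learned sparse codes the reconstruction would only approximate the original responses.

```python
import numpy as np


def correlate_valid(feat, filt):
    """Dense 'valid' cross-correlation of an (H, W, C) feature map with an (h, w, C) filter."""
    H, W, _ = feat.shape
    h, w, _ = filt.shape
    out = np.empty((H - h + 1, W - w + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(feat[y:y + h, x:x + w] * filt)
    return out


def demo(seed=0, n_filters=40, n_atoms=8, n_active=3):
    """Compare direct filtering with reconstruction from dictionary responses."""
    rng = np.random.default_rng(seed)
    h, w, C = 3, 3, 4                         # assumed filter size and channel count
    feat = rng.standard_normal((12, 10, C))   # random stand-in for a feature map

    # Hypothetical shared dictionary of n_atoms filter-shaped elements.
    D = rng.standard_normal((n_atoms, h, w, C))

    # Activation matrix: each filter uses only n_active dictionary elements.
    # Stored densely here for brevity; a real system would use sparse storage.
    A = np.zeros((n_filters, n_atoms))
    for i in range(n_filters):
        idx = rng.choice(n_atoms, size=n_active, replace=False)
        A[i, idx] = rng.standard_normal(n_active)

    # Part filters built as exact sparse combinations of dictionary elements.
    filters = np.tensordot(A, D, axes=1)      # shape (n_filters, h, w, C)

    # Step 1: filter the feature map with the dictionary only (n_atoms passes).
    dict_resp = np.stack([correlate_valid(feat, d) for d in D])

    # Step 2: recover every part filter response as a combination of
    # dictionary responses (a matrix product at each output location).
    recon = np.tensordot(A, dict_resp, axes=1)

    # Baseline: one filtering pass per part filter (n_filters passes).
    direct = np.stack([correlate_valid(feat, f) for f in filters])
    return direct, recon, A


if __name__ == "__main__":
    direct, recon, A = demo()
    print("max abs difference:", np.abs(direct - recon).max())
```

In this sketch the expensive filtering step runs once per dictionary element rather than once per part filter, so its cost does not grow with the number of filters. Only the cheap combination step does, and its cost depends on the number of nonzero coefficients per filter.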

Author information

Authors and Affiliations

UC Berkeley, USA
Hyun Oh Song, Tim Althoff & Trevor Darrell
iRobot, USA
Stefan Zickler & Christopher Geyer
University of Chicago, USA
Ross Girshick
Max Planck Institute for Informatics, Germany
Mario Fritz
Brown University, USA
Pedro Felzenszwalb

Editor information

Editors and Affiliations

Microsoft Research Ltd, CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid

About this paper

Cite this paper

Song, H.O. et al. (2012). Sparselet Models for Efficient Multiclass Object Detection. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7573. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33709-3_57

DOI: https://doi.org/10.1007/978-3-642-33709-3_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33708-6
Online ISBN: 978-3-642-33709-3
eBook Packages: Computer Science, Computer Science (R0)
