Abstract
We present an approach for learning to detect objects in still gray images that is based on a sparse, part-based representation of objects. A vocabulary of information-rich object parts is automatically constructed from a set of sample images of the object class of interest. Images are then represented using parts from this vocabulary, along with spatial relations observed among them. Based on this representation, a feature-efficient learning algorithm is used to learn to detect instances of the object class. The framework developed can be applied to any object with distinguishable parts in a relatively fixed spatial configuration. We report experiments on images of side views of cars. Our experiments show that the method achieves high detection accuracy on a difficult test set of real-world images, and is highly robust to partial occlusion and background variation.
In addition, we discuss and offer solutions to several methodological issues that affect how the research community can evaluate object detection approaches.
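To make the idea of a feature-efficient learner over a sparse part-based representation concrete, the following is a minimal sketch, assuming a Winnow-style multiplicative update over sparse binary features. The abstract does not specify the learner or the feature encoding, so the class name `SparseWinnow`, the feature strings (hypothetical part and relation names), the threshold, the promotion factor, and the toy examples are all illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

Example = Tuple[Set[str], bool]  # (active feature names, is-object label)


class SparseWinnow:
    """Winnow-style linear classifier over sparse binary features (illustrative)."""

    def __init__(self, threshold: float = 2.0, alpha: float = 2.0, init: float = 1.0):
        self.threshold = threshold  # decision threshold (assumed value)
        self.alpha = alpha          # promotion factor; demotion uses 1 / alpha
        self.init = init            # weight assumed for features not yet updated
        self.weights: Dict[str, float] = {}

    def score(self, active: Set[str]) -> float:
        # Only active features contribute, so the cost scales with the
        # number of active features, not with the vocabulary size.
        return sum(self.weights.get(f, self.init) for f in active)

    def predict(self, active: Set[str]) -> bool:
        return self.score(active) >= self.threshold

    def update(self, active: Set[str], label: bool) -> bool:
        """Mistake-driven update; returns True if a mistake was made."""
        if self.predict(active) == label:
            return False
        factor = self.alpha if label else 1.0 / self.alpha
        for f in active:
            self.weights[f] = self.weights.get(f, self.init) * factor
        return True

    def fit(self, examples: List[Example], epochs: int = 10) -> List[int]:
        """Runs up to `epochs` passes; returns the mistake count of each pass."""
        history: List[int] = []
        for _ in range(epochs):
            mistakes = sum(self.update(active, label) for active, label in examples)
            history.append(mistakes)
            if mistakes == 0:
                break
        return history


# Toy data: each image window is the set of vocabulary parts and spatial
# relations detected in it (all names are hypothetical).
TOY_EXAMPLES: List[Example] = [
    ({"wheel", "wheel_pair", "window"}, True),
    ({"window", "tree"}, False),
    ({"wheel", "wheel_pair", "bumper"}, True),
    ({"bumper", "sky"}, False),
    ({"wheel", "tree"}, False),
]

if __name__ == "__main__":
    clf = SparseWinnow()
    print("mistakes per epoch:", clf.fit(TOY_EXAMPLES))
    print("car-like window:", clf.predict({"wheel", "wheel_pair"}))
    print("background window:", clf.predict({"sky", "tree"}))
```

Representing each window as a set of active features keeps both prediction and update proportional to the number of parts and relations actually present, which is the property that makes this family of learners attractive when the vocabulary is large but each image activates only a few entries.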
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Agarwal, S., Roth, D. (2002). Learning a Sparse Representation for Object Detection. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds) Computer Vision — ECCV 2002. ECCV 2002. Lecture Notes in Computer Science, vol 2353. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47979-1_8
DOI: https://doi.org/10.1007/3-540-47979-1_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43748-2
Online ISBN: 978-3-540-47979-6
eBook Packages: Springer Book Archive