Abstract
The tasks of visual object recognition and classification are natural and effortless for biological visual systems, but exceedingly difficult to replicate in computer vision systems. This difficulty arises from the large variability in images of different objects within a class, and from variability in viewing conditions. In this paper we describe a fragment-based method for object classification. In this approach, objects within a class, such as faces or cars, are represented in terms of common image fragments that serve as building blocks for representing a large variety of different objects belonging to that class. Optimal fragments are selected from a training set of images by maximizing the mutual information between the fragments and the class they represent. For the purpose of classification the fragments are also organized into types, where each type is a collection of alternative fragments, such as different hairline or eye regions for face classification. During classification, the algorithm detects fragments of the different types and then combines the evidence from the detected fragments to reach a final decision. The algorithm verifies the proper arrangement of the fragments and the consistency of the viewing conditions primarily by the conjunction of overlapping fragments. The method differs from previous part-based methods in using class-specific overlapping object fragments of varying complexity, and in relying on this conjunction of overlapping detected fragments to verify their consistent arrangement. Experimental results on the detection of face and car views show that the fragment-based approach can generalize well to completely novel image views within a class while maintaining low misclassification error rates. We briefly discuss relationships between the proposed method and properties of parts of the primate visual system involved in object perception.
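The selection criterion mentioned in the abstract, maximizing the mutual information between a fragment and the class, can be illustrated with a minimal sketch. It assumes that each candidate fragment has already been reduced to a binary "detected / not detected" value per training image, and that class labels are binary (class vs. non-class). The function names `mutual_information` and `rank_fragments`, the fragment names, and the toy data are hypothetical and not taken from the paper; the abstract does not say how detection is thresholded or how many fragments are kept.

```python
import math


def mutual_information(detected, labels):
    """Estimate I(F; C) in bits from paired binary observations.

    detected[i] is 1 if the fragment was found in training image i, else 0.
    labels[i] is 1 if image i belongs to the class, else 0.
    (Binary detection is an assumption made for this sketch.)
    """
    n = len(labels)
    if n == 0 or n != len(detected):
        raise ValueError("detected and labels must be non-empty and equal length")

    mi = 0.0
    for f in (0, 1):
        p_f = sum(1 for d in detected if d == f) / n
        for c in (0, 1):
            p_c = sum(1 for l in labels if l == c) / n
            joint = sum(
                1 for d, l in zip(detected, labels) if d == f and l == c
            ) / n
            if joint > 0:  # terms with zero joint probability contribute nothing
                mi += joint * math.log2(joint / (p_f * p_c))
    return mi


def rank_fragments(detections_by_fragment, labels):
    """Return fragment names sorted by decreasing mutual information."""
    scores = {
        name: mutual_information(detected, labels)
        for name, detected in detections_by_fragment.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy data: four training images, the first two belong to the class.
    labels = [1, 1, 0, 0]
    candidates = {
        "eye_region": [1, 1, 0, 0],   # fires exactly on class images
        "background": [1, 0, 1, 0],   # fires independently of the class
    }
    print(rank_fragments(candidates, labels))
```

In this toy setting a fragment that fires exactly on the class images carries one full bit about the class, while a fragment that fires independently of the class carries none, so the former is ranked first.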
Copyright information
© 2000 Springer-Verlag Berlin-Heidelberg
About this paper
Cite this paper
Ullman, S., Sali, E. (2000). Object Classification Using a Fragment-Based Representation. In: Lee, SW., Bülthoff, H.H., Poggio, T. (eds) Biologically Motivated Computer Vision. BMCV 2000. Lecture Notes in Computer Science, vol 1811. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45482-9_8
DOI: https://doi.org/10.1007/3-540-45482-9_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67560-0
Online ISBN: 978-3-540-45482-3
eBook Packages: Springer Book Archive