Abstract
This paper proposes a new approach to learning a discriminative model of object classes that efficiently incorporates appearance, shape and context information. The learned model is used for automatic visual recognition and semantic segmentation of photographs. Our discriminative model exploits novel features, based on textons, which jointly model shape and texture. Unary classification and feature selection are achieved using shared boosting, giving an efficient classifier that can be applied to a large number of classes. Accurate image segmentation is achieved by incorporating these classifiers in a conditional random field. Efficient training of the model on very large datasets is achieved by exploiting both random feature selection and piecewise training methods.
High classification and segmentation accuracy is demonstrated on three different databases: i) our own 21-object class database of photographs of real objects viewed under general lighting conditions, poses and viewpoints, ii) the 7-class Corel subset and iii) the 7-class Sowerby database used in [1]. The proposed algorithm gives competitive results for highly textured (e.g. grass, trees), highly structured (e.g. cars, faces, bikes, aeroplanes) and articulated objects (e.g. body, cow).
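As an illustration only of the texton idea mentioned in the abstract, the sketch below assigns each pixel to a cluster of per-pixel feature vectors, producing a "texton map". The abstract does not specify the filter bank or the clustering method, so this assumes precomputed per-pixel responses and plain k-means; the function name `texton_map` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def texton_map(features, k, n_iter=10, seed=0):
    """Cluster per-pixel feature vectors with k-means (an assumed choice).

    features: array of shape (H, W, D), e.g. filter-bank responses.
    Returns (labels, centres): an (H, W) integer map and a (k, D) array.
    """
    h, w, d = features.shape
    x = features.reshape(-1, d).astype(float)
    rng = np.random.default_rng(seed)
    # Initialise the centres from k distinct pixels (requires H * W >= k).
    centres = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every pixel to every centre.
        dist = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centre if a cluster empties
                centres[j] = members.mean(axis=0)
    # Final assignment against the updated centres.
    dist = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1).reshape(h, w), centres

# Usage on synthetic data standing in for real filter responses.
demo_features = np.random.default_rng(1).normal(size=(8, 8, 3))
demo_labels, demo_centres = texton_map(demo_features, k=4)
```

Each entry of `demo_labels` is a cluster index in the range 0 to k-1. Features for a classifier could then be computed from such a map, but how the paper does so is not described in the abstract.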
References
He, X., Zemel, R.S., Carreira-Perpiñán, M.A.: Multiscale conditional random fields for image labeling. In: Proc. of IEEE CVPR (2004)
Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: CVPR 2003, vol. II, pp. 264–271 (2003)
Berg, A.C., Berg, T.L., Malik, J.: Shape matching and object recognition using low distortion correspondences. In: CVPR (2005)
Winn, J., Criminisi, A., Minka, T.: Categorization by learned universal visual dictionary. In: Int. Conf. of Computer Vision (2005)
Kumar, S., Hebert, M.: Discriminative fields for modeling spatial dependencies in natural images. In: NIPS (2004)
Borenstein, E., Sharon, E., Ullman, S.: Combining top-down and bottom-up segmentation. In: Proceedings IEEE workshop on Perceptual Organization in Computer Vision, CVPR (2004)
Winn, J., Jojic, N.: LOCUS: Learning Object Classes with Unsupervised Segmentation. In: Proc. of IEEE ICCV (2005)
Kumar, P., Torr, P., Zisserman, A.: Obj cut. In: Proc. of IEEE CVPR (2005)
Leibe, B., Schiele, B.: Interleaved object categorization and segmentation. In: BMVC 2003, vol. II, pp. 264–271 (2003)
Duygulu, P., Barnard, K., de Freitas, J.F.G., Forsyth, D.: Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2353, pp. 97–112. Springer, Heidelberg (2002)
Tu, Z., Chen, X., Yuille, A.L., Zhu, S.: Image parsing: Unifying segmentation, detection, and recognition. In: CVPR (2003)
Konishi, S., Yuille, A.L.: Statistical cues for domain specific image segmentation with performance analysis. In: CVPR (2000)
Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: ICML (2001)
Boykov, Y., Jolly, M.P.: Interactive graph cuts for optimal boundary and region segmentation of objects in n-d images. In: Proc. of IEEE ICCV (2001)
Rother, C., Kolmogorov, V., Blake, A.: Interactive foreground extraction using iterated graph cuts. In: ACM Transactions on Graphics, SIGGRAPH 2004 (2004)
Sutton, C., McCallum, A.: Piecewise training of undirected models. In: 21st Conference on Uncertainty in Artificial Intelligence (2005)
Leung, T., Malik, J.: Representing and recognizing the visual appearance of materials using three-dimensional textons. IJCV 43, 29–44 (2001)
Varma, M., Zisserman, A.: A statistical approach to texture classification from single images. International Journal of Computer Vision: Special Issue on Texture Analysis and Synthesis 62, 61–81 (2005)
Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: CVPR 2001, vol. I, pp. 511–518 (2001)
Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. PAMI 24, 509–522 (2002)
Torralba, A., Murphy, K., Freeman, W.: Sharing features: efficient boosting procedures for multiclass object detection. In: Proc. of IEEE CVPR, pp. 762–769 (2004)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. Technical report, Dept. of Statistics, Stanford University (1998)
Baluja, S., Rowley, H.A.: Boosting sex identification performance, pp. 1508–1513. AAAI Press, Menlo Park (2005)
Kumar, S., Hebert, M.: A hierarchical field framework for unified context-based classification. In: ICCV 2005, vol. II, pp. 1284–1291 (2005)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shotton, J., Winn, J., Rother, C., Criminisi, A. (2006). TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation. In: Leonardis, A., Bischof, H., Pinz, A. (eds) Computer Vision – ECCV 2006. ECCV 2006. Lecture Notes in Computer Science, vol 3951. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11744023_1
DOI: https://doi.org/10.1007/11744023_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33832-1
Online ISBN: 978-3-540-33833-8
eBook Packages: Computer Science, Computer Science (R0)