Abstract
Bottom-up segmentation based only on low-level cues is a notoriously difficult problem. This difficulty has led to recent top-down segmentation algorithms that are based on class-specific image information. Despite the success of top-down algorithms, they often give coarse segmentations that can be significantly refined using low-level cues. This raises the question of how to combine both top-down and bottom-up cues in a principled manner.
In this paper we approach this problem using supervised learning. Given a training set of ground truth segmentations, we train a fragment-based segmentation algorithm which takes into account both bottom-up and top-down cues simultaneously, in contrast to most existing algorithms, which train top-down and bottom-up modules separately. We formulate the problem in the framework of Conditional Random Fields (CRFs) and derive a novel feature induction algorithm for CRFs, which allows us to efficiently search over thousands of candidate fragments. Whereas pure top-down algorithms often require hundreds of fragments, our simultaneous learning procedure yields algorithms with a handful of fragments that are combined with low-level cues to efficiently compute high quality segmentations.
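To make the idea of combining the two kinds of cues concrete, the following is a minimal sketch, not the method of the paper. It assumes a binary labeling energy with a bottom-up pairwise term (neighbouring pixels of similar intensity are costly to separate) and a top-down term that penalizes disagreement with a fragment mask placed at a fixed location. The function names, the weights `nu` and `lam`, the Gaussian affinity, and the toy image are all illustrative assumptions; in the paper the weights are learned and inference is approximate, whereas here the weights are set by hand and the minimum is found by brute-force enumeration, which is only feasible on a tiny grid.

```python
import itertools
import math


def pairwise_weight(a, b, sigma=0.1):
    # Bottom-up cue (assumed form): similar intensities give a weight
    # near 1, so assigning the two pixels different labels is costly.
    return math.exp(-((a - b) ** 2) / (2 * sigma ** 2))


def energy(labels, image, fragments, nu=1.0):
    h, w = len(image), len(image[0])
    total = 0.0
    # Bottom-up term over right and down neighbours of every pixel.
    for r in range(h):
        for c in range(w):
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < h and cc < w:
                    wgt = pairwise_weight(image[r][c], image[rr][cc])
                    total += nu * wgt * abs(labels[r][c] - labels[rr][cc])
    # Top-down term: each fragment is (lam, top, left, mask) and charges
    # lam for every pixel whose label disagrees with the fragment mask.
    for lam, top, left, mask in fragments:
        for dr, mask_row in enumerate(mask):
            for dc, m in enumerate(mask_row):
                total += lam * abs(labels[top + dr][left + dc] - m)
    return total


def segment(image, fragments):
    # Exhaustive search over all binary labelings (toy-sized grids only).
    h, w = len(image), len(image[0])
    best, best_e = None, float("inf")
    for flat in itertools.product((0, 1), repeat=h * w):
        labels = [list(flat[r * w:(r + 1) * w]) for r in range(h)]
        e = energy(labels, image, fragments)
        if e < best_e:
            best, best_e = labels, e
    return best


# Toy image: a bright 2x2 object on a dark background.
image = [
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
# One coarse fragment that only covers the top row of the image.
fragments = [(3.0, 0, 0, [[0, 1, 1, 0]])]

segmentation = segment(image, fragments)
for row in segmentation:
    print(row)
```

In this toy setup the fragment constrains only the top row, yet the minimum-energy labeling marks the whole bright block as foreground, because the pairwise term makes it expensive to cut between pixels of similar intensity. This is the sense in which a coarse top-down estimate can be refined by low-level cues.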
Keywords
- Conditional Random Field
- Image Fragment
- Ground Truth Segmentation
- Marginal Vector
- Generalize Belief Propagation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Levin, A., Weiss, Y. (2006). Learning to Combine Bottom-Up and Top-Down Segmentation. In: Leonardis, A., Bischof, H., Pinz, A. (eds) Computer Vision – ECCV 2006. ECCV 2006. Lecture Notes in Computer Science, vol 3954. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11744085_45
DOI: https://doi.org/10.1007/11744085_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33838-3
Online ISBN: 978-3-540-33839-0
eBook Packages: Computer Science, Computer Science (R0)