Abstract
Unsupervised learning requires a grouping step that defines which data belong together. A natural way of grouping in images is the segmentation of objects or parts of objects. While pure bottom-up segmentation from static cues is well known to be ambiguous at the object level, the story changes as soon as objects move. In this paper, we present a method that uses long-term point trajectories based on dense optical flow. Defining pairwise distances between these trajectories allows them to be clustered, which results in temporally consistent segmentations of moving objects in a video shot. In contrast to multi-body factorization, points and even whole objects may appear or disappear during the shot. We provide a benchmark dataset and an evaluation method for this previously unaddressed setting.
This work was supported by the German Academic Exchange Service (DAAD) and ONR MURI N00014-06-1-0734.
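The abstract states only that pairwise distances between trajectories are defined and then clustered; it does not give the distance or the clustering algorithm. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation. It assumes the distance is the largest motion difference over the frames two tracks share, a Gaussian affinity with a free `scale` parameter, and a two-way normalized spectral split. All function names (`trajectory_distance`, `affinity_matrix`, `two_way_spectral_split`, `make_track`) and the toy data are hypothetical.

```python
import numpy as np

def trajectory_distance(a, b, min_overlap=2):
    """Distance between two tracks that may cover different frame ranges.

    Each track is (start_frame, positions), positions having shape (T, 2).
    Returns np.inf when the tracks share fewer than `min_overlap` frames.
    """
    start_a, pos_a = a
    start_b, pos_b = b
    first = max(start_a, start_b)
    last = min(start_a + len(pos_a), start_b + len(pos_b))  # exclusive
    if last - first < min_overlap:
        return np.inf
    seg_a = pos_a[first - start_a:last - start_a]
    seg_b = pos_b[first - start_b:last - start_b]
    # Frame-to-frame motion of each track over the common frames.
    vel_a = np.diff(seg_a, axis=0)
    vel_b = np.diff(seg_b, axis=0)
    # Assumption: use the largest motion difference over the common frames.
    return float(np.max(np.linalg.norm(vel_a - vel_b, axis=1)))

def affinity_matrix(tracks, scale=1.0):
    """Gaussian affinities; pairs without enough overlap get affinity 0."""
    n = len(tracks)
    w = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            d = trajectory_distance(tracks[i], tracks[j])
            w[i, j] = w[j, i] = np.exp(-scale * d * d)
    return w

def two_way_spectral_split(w):
    """Split into two groups by the sign of the second eigenvector of the
    symmetric normalized graph Laplacian."""
    inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))
    laplacian = np.eye(len(w)) - inv_sqrt[:, None] * w * inv_sqrt[None, :]
    _, vectors = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    fiedler = inv_sqrt * vectors[:, 1]
    return (fiedler > 0).astype(int)

def make_track(start, length, origin, velocity):
    """Toy track with constant velocity, for illustration only."""
    frames = np.arange(length)[:, None]
    return start, np.asarray(origin, float) + frames * np.asarray(velocity, float)

tracks = [
    make_track(0, 10, (0, 0), (3, 0)),    # moving object, disappears early
    make_track(4, 10, (5, 2), (3, 0)),    # moving object, appears later
    make_track(0, 14, (2, 5), (3, 0)),    # moving object, whole shot
    make_track(0, 14, (50, 50), (0, 0)),  # static background, whole shot
    make_track(6, 8, (60, 40), (0, 0)),   # background, appears later
    make_track(0, 7, (40, 60), (0, 0)),   # background, disappears early
]
labels = two_way_spectral_split(affinity_matrix(tracks))
```

In this toy shot the tracks have different life spans, and two of the background tracks share only a single frame, so their pairwise affinity is zero. They still end up in the same group because both are linked to a third background track. The label values themselves are arbitrary; only the grouping is meaningful.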
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brox, T., Malik, J. (2010). Object Segmentation by Long Term Analysis of Point Trajectories. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15555-0_21
DOI: https://doi.org/10.1007/978-3-642-15555-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15554-3
Online ISBN: 978-3-642-15555-0
eBook Packages: Computer Science (R0)