Abstract
While image registration has been studied in different areas of computer vision, aligning images depicting different scenes remains a challenging problem, closer to recognition than to image matching. Analogous to optical flow, where an image is aligned to its temporally adjacent frame, we propose SIFT flow, a method to align an image to its neighbors in a large image collection consisting of a variety of scenes. For a query image, histogram intersection on a bag-of-visual-words representation is used to find the set of nearest neighbors in the database. The SIFT flow algorithm then consists of matching densely sampled SIFT features between the two images, while preserving spatial discontinuities. The use of SIFT features allows robust matching across different scene/object appearances and the discontinuity-preserving spatial model allows matching of objects located at different parts of the scene. Experiments show that the proposed approach is able to robustly align complicated scenes with large spatial distortions. We collect a large database of videos and apply the SIFT flow algorithm to two applications: (i) motion field prediction from a single static image and (ii) motion synthesis via transfer of moving objects.
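The retrieval step described above, histogram intersection over bag-of-visual-words histograms, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the L1 normalization, the toy 4-word vocabulary, and the value of `k` are all assumptions made for the example, and every histogram is assumed to be non-empty.

```python
import numpy as np

def histogram_intersection(query, database):
    """Sum of element-wise minima between one histogram and each database row."""
    return np.minimum(query[None, :], database).sum(axis=1)

def nearest_neighbors(query, database, k=3):
    """Return indices and scores of the k database histograms most similar to query."""
    # Assumption: L1-normalize so images with different feature counts are comparable.
    q = query / query.sum()
    db = database / database.sum(axis=1, keepdims=True)
    scores = histogram_intersection(q, db)
    # Higher intersection means more similar, so sort in descending order.
    order = np.argsort(-scores, kind="stable")[:k]
    return order, scores[order]

# Toy database: 4 images, each a histogram over a hypothetical 4-word vocabulary.
database = np.array([
    [4, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 2, 2],
    [3, 1, 0, 0],
], dtype=float)
query = np.array([3, 1, 0, 0], dtype=float)

idx, scores = nearest_neighbors(query, database, k=2)
print(idx, scores)
```

With normalized histograms the intersection score lies in [0, 1], and a score of 1 means the two histograms are identical. The retrieved neighbors would then be passed to the dense matching stage.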
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, C., Yuen, J., Torralba, A., Sivic, J., Freeman, W.T. (2008). SIFT Flow: Dense Correspondence across Different Scenes. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88690-7_3
DOI: https://doi.org/10.1007/978-3-540-88690-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88689-1
Online ISBN: 978-3-540-88690-7
eBook Packages: Computer Science, Computer Science (R0)