Abstract
Long scenes can be imaged by mosaicing multiple images from cameras scanning the scene. We address the case of a video camera scanning a scene while moving along a long path, e.g., scanning a city street from a driving car or scanning terrain from a low-flying aircraft.
A robust approach to this task is presented and applied successfully to sequences of thousands of frames, even when captured with a hand-held camera. Examples are given on a few challenging sequences. The proposed system consists of two components: (i) motion and depth computation and (ii) mosaic rendering.
In the first part, a “direct” method is presented for computing motion and dense depth. The robustness of motion computation is increased by limiting the motion model of the scanning camera. An iterative graph-cuts approach, with planar labels and a flexible similarity measure, allows dense depth to be computed for the entire sequence.
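To make the notion of a planar label concrete, the following is a minimal sketch and not the method of the article. It assumes a purely horizontal image motion between two grayscale frames, models each label as a plane d(x, y) = a·x + b·y + c in disparity space, scores labels with a plain absolute-difference cost (a stand-in for the flexible similarity measure), and picks the cheapest label per pixel instead of running a graph-cuts optimization with a smoothness term. All function names are hypothetical.

```python
import numpy as np

def planar_disparity(a, b, c, height, width):
    """Disparity map induced by one planar label: d(x, y) = a*x + b*y + c."""
    ys, xs = np.mgrid[0:height, 0:width]
    return a * xs + b * ys + c

def label_costs(left, right, planes):
    """Per-pixel absolute-difference cost of each planar label.

    left, right: 2-D float arrays of equal shape (grayscale frames).
    planes: sequence of (a, b, c) tuples, one per label.
    Returns an array of shape (num_labels, height, width).
    """
    h, w = left.shape
    ys, xs = np.mgrid[0:h, 0:w]
    costs = np.empty((len(planes), h, w))
    for k, (a, b, c) in enumerate(planes):
        d = planar_disparity(a, b, c, h, w)
        # Nearest-neighbour lookup of the matching column, clamped to the image.
        xr = np.clip(np.rint(xs - d).astype(int), 0, w - 1)
        costs[k] = np.abs(left - right[ys, xr])
    return costs

def winner_take_all(costs):
    """Index of the cheapest label at every pixel (data term only, no smoothness)."""
    return np.argmin(costs, axis=0)
```

A graph-cuts formulation would replace `winner_take_all` with an energy minimization that adds a pairwise smoothness penalty between neighbouring pixels; the per-label cost volume computed here would serve as its data term.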
In the second part, a new minimal aspect distortion (MAD) mosaicing method uses depth to minimize the geometric distortions of long panoramic images. In addition to MAD mosaicing, interactive visualization using X-Slits is also demonstrated.
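As a rough illustration of strip-based mosaicing, the sketch below builds a panorama by taking one pixel column from every frame, with the column position varying linearly with the frame index. It assumes a camera translating sideways at roughly constant speed and a video stored as a NumPy array; a slope of zero gives a pushbroom-style mosaic, and a nonzero slope gives a slanted slice of the space-time volume of the kind used for X-Slits views. It does not use depth and is not the MAD mosaicing method itself; the function name and parameters are hypothetical.

```python
import numpy as np

def linear_slit_mosaic(video, start_col, slope):
    """Paste one column per frame into a mosaic.

    video: array of shape (T, H, W) or (T, H, W, C).
    start_col: column sampled from frame 0.
    slope: change in sampled column per frame (0 gives a pushbroom mosaic).
    Returns an array of shape (H, T) or (H, T, C).
    """
    num_frames, width = video.shape[0], video.shape[2]
    t = np.arange(num_frames)
    # Column to sample from each frame, rounded and clamped to the image width.
    cols = np.clip(np.rint(start_col + slope * t).astype(int), 0, width - 1)
    # Advanced indexing pairs frame t with column cols[t]; result is (T, H[, C]).
    strips = video[t, :, cols]
    # Put the frame axis along the horizontal direction of the mosaic.
    return np.swapaxes(strips, 0, 1)
```

Changing `start_col` and `slope` interactively changes the apparent viewpoint of the resulting panorama, which is the basic idea behind interactive visualization of this kind.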
Additional information
This research was supported by the Israel Science Foundation. Video examples and high-resolution images can be viewed at http://www.vision.huji.ac.il/mad/.
About this article
Cite this article
Rav-Acha, A., Engel, G. & Peleg, S. Minimal Aspect Distortion (MAD) Mosaicing of Long Scenes. Int J Comput Vis 78, 187–206 (2008). https://doi.org/10.1007/s11263-007-0101-9