Abstract
A new algorithm is proposed for efficient stereo and novel view synthesis. Given the video streams acquired by two synchronized cameras, the proposed algorithm synthesises images from a virtual camera at an arbitrary position near the physical cameras. The new technique is based on an improved dynamic-programming stereo algorithm for efficient novel view generation. The two main contributions of this paper are: (i) a new four-state matching graph for dense stereo dynamic programming that supports accurate occlusion labelling; (ii) a compact geometric derivation for novel view synthesis by direct projection of the minimum cost surface. Furthermore, the paper presents an algorithm for the temporal maintenance of a background model, which enhances the rendering of occlusions and reduces temporal artefacts (flicker), and a cost aggregation algorithm that acts directly in the three-dimensional matching cost space.
The proposed algorithm has been designed to work with input images that have a large disparity range, a common practical situation. The enhanced occlusion handling capabilities of the new dynamic programming algorithm are evaluated against those of the most powerful state-of-the-art dynamic programming and graph-cut techniques. Four-state DP is also evaluated against the disparity-based Middlebury error metrics, and its performance is found to be amongst the best of the efficient algorithms. A number of examples demonstrate the robustness of four-state DP to artefacts in stereo video streams. These include demonstrations of cyclopean view synthesis in extended conversational sequences, synthesis from a freely translating virtual camera and, finally, basic 3D scene editing.
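To illustrate the general idea of scanline dynamic programming with explicit occlusion labelling, the following is a minimal sketch in Python. It is not the four-state matching graph of the paper, whose states and transition costs are not given in the abstract. It is a simplified three-move alignment (match, skip a left pixel, skip a right pixel) of the kind used in classical dynamic-programming stereo. The function name `scanline_dp`, the absolute-difference matching cost and the constant `occlusion_cost` are assumptions made for illustration only.

```python
def scanline_dp(left, right, occlusion_cost=20.0):
    """Align one pair of rectified scanlines by dynamic programming.

    Simplified three-move model (illustrative assumption, not the paper's
    four-state graph). Returns (disparity, total_cost), where disparity has
    one entry per left pixel: x_left - x_right if matched, None if occluded.
    """
    n, m = len(left), len(right)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of left[:i] with right[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0:
                # Match left pixel i-1 with right pixel j-1.
                pixel_cost = abs(left[i - 1] - right[j - 1])
                candidates.append((cost[i - 1][j - 1] + pixel_cost, "match"))
            if i > 0:
                # Left pixel i-1 has no counterpart in the right image.
                candidates.append((cost[i - 1][j] + occlusion_cost, "skip_left"))
            if j > 0:
                # Right pixel j-1 has no counterpart in the left image.
                candidates.append((cost[i][j - 1] + occlusion_cost, "skip_right"))
            cost[i][j], move[i][j] = min(candidates)

    # Backtrack along the minimum-cost path to label every left pixel.
    disparity = [None] * n
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "match":
            disparity[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif step == "skip_left":
            i -= 1
        else:
            j -= 1
    return disparity, cost[n][m]


if __name__ == "__main__":
    # Toy intensities: the right row is the left row shifted by one pixel.
    left_row = [10, 20, 50, 90, 30, 40]
    right_row = [20, 50, 90, 30, 40, 45]
    print(scanline_dp(left_row, right_row))
```

In this toy pair, the first left pixel has no counterpart in the right row, so the minimum-cost path marks it as occluded and assigns a disparity of one pixel to the rest. A practical system would add a disparity range limit, a more robust matching cost and aggregation across scanlines, none of which are shown here.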
Cite this article
Criminisi, A., Blake, A., Rother, C. et al. Efficient Dense Stereo with Occlusions for New View-Synthesis by Four-State Dynamic Programming. Int J Comput Vision 71, 89–110 (2007). https://doi.org/10.1007/s11263-006-8525-1
DOI: https://doi.org/10.1007/s11263-006-8525-1