Abstract
We address the problem of computing the three-dimensional motions of objects in a long sequence of stereo frames. Our approach is bottom-up and consists of two levels. The first level tracks 3D tokens from frame to frame and estimates their kinematics; this processing is completely parallel across tokens. The second level groups tokens into objects based on their kinematic parameters, controls the low-level processing to cope with problems such as occlusion, disappearance, and appearance of tokens, and provides information to other components of the system. We have implemented this approach using 3D line segments obtained from stereo as the tokens. We use classical kinematics and derive closed-form solutions for some special, but useful, cases of motion. The motion computation problem is then formulated as a tracking problem so that the extended Kalman filter can be applied. The tracking is performed in a prediction-matching-update loop in which multiple matches can be handled. Each token is labeled with a number, called its support of existence, which measures how well the token agrees with the measurements. If this number exceeds a threshold, the token disappears. The individual line segments can be grouped into rigid objects according to the similarity of their kinematic parameters. Experiments have been carried out on synthetic and real data, and the results are found to be quite good.
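The prediction-matching-update loop described above can be illustrated with a much simplified sketch. It is not the authors' implementation: it assumes a 1D point token with a constant-velocity model and a linear Kalman filter, whereas the paper tracks 3D line segments with an extended Kalman filter. It also keeps only the single nearest gated measurement rather than handling multiple matches. The names `Token`, `track_step`, `gate`, and `max_misses`, and the use of a miss counter as a stand-in for the support of existence, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """A tracked token reduced to a 1D position/velocity state (illustrative)."""
    x: float                      # position estimate
    v: float                      # velocity estimate
    P: list = field(default_factory=lambda: [[1.0, 0.0], [0.0, 1.0]])  # 2x2 covariance
    misses: int = 0               # hypothetical stand-in for the support of existence


def predict(tok, dt=1.0, q=0.01):
    """Constant-velocity prediction: state <- F state, P <- F P F^T + q*I."""
    (a, b), (c, d) = tok.P
    tok.x += tok.v * dt
    tok.P = [[a + dt * (b + c) + dt * dt * d + q, b + dt * d],
             [c + dt * d, d + q]]


def match(tok, measurements, r, gate):
    """Return the gated measurement with the smallest squared Mahalanobis distance, or None."""
    s = tok.P[0][0] + r           # innovation variance for H = [1, 0]
    best, best_d2 = None, gate
    for z in measurements:
        d2 = (z - tok.x) ** 2 / s
        if d2 <= best_d2:
            best, best_d2 = z, d2
    return best


def update(tok, z, r):
    """Standard Kalman update for a position-only measurement z with variance r."""
    (a, b), (c, d) = tok.P
    s = a + r
    k0, k1 = a / s, c / s         # Kalman gain
    y = z - tok.x                 # innovation
    tok.x += k0 * y
    tok.v += k1 * y
    tok.P = [[(1 - k0) * a, (1 - k0) * b],
             [c - k1 * a, d - k1 * b]]


def track_step(tok, measurements, r=0.1, gate=9.0, max_misses=2):
    """Run one prediction-matching-update cycle; return False once the token should be dropped."""
    predict(tok)
    z = match(tok, measurements, r, gate)
    if z is None:
        tok.misses += 1           # no measurement inside the gate: evidence against the token
    else:
        update(tok, z, r)
        tok.misses = 0
    return tok.misses <= max_misses
```

Calling `track_step` once per frame with that frame's measurements advances the token. A token that finds no measurement inside its gate for more than `max_misses` consecutive frames is reported as no longer alive, which mirrors the threshold test on the support of existence in spirit only.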
Cite this article
Zhang, Z., Faugeras, O.D. Three-dimensional motion computation and object segmentation in a long sequence of stereo frames. Int J Comput Vision 7, 211–241 (1992). https://doi.org/10.1007/BF00126394
DOI: https://doi.org/10.1007/BF00126394