Coupled Visual and Kinematic Manifold Models for Tracking

Lee, C.-S.; Elgammal, A.

doi:10.1007/s11263-009-0266-5

Published: 15 July 2009

Volume 87, pages 118–139, (2010)

International Journal of Computer Vision


Abstract

In this paper, we consider modeling data lying on multiple continuous manifolds. In particular, we model the shape manifold of a person performing a motion observed from different viewpoints along a view circle at a fixed camera height. We introduce a model that ties together the body configuration (kinematics) manifold and the visual (observations) manifold in a way that facilitates tracking the 3D configuration with continuous relative view variability. The model exploits the low dimensionality of both the body configuration manifold and the view manifold, each of which is represented separately. The resulting representation is used for tracking complex motions within a Bayesian framework, in which the model provides a low-dimensional state representation as well as a constrained dynamic model for both body configuration and view variations. Experimental results estimating the 3D body posture from a single camera are presented for the HUMANEVA dataset and other complex motion video sequences.
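As a rough illustration of what Bayesian tracking over a low-dimensional state of this kind can look like, the sketch below runs a basic particle filter over two angles: a body configuration phase `b` and a view angle `v` on the view circle. It is not the model from the paper. The choice of a particle filter, the periodic parameterization of the configuration, the toy `observe` function standing in for a learned mapping to shape observations, and every step size and noise level are assumptions made only for this example.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def observe(b, v):
    """Hypothetical stand-in for a learned mapping from (configuration, view)
    to a shape descriptor; the paper's actual mapping is not reproduced here."""
    return [math.cos(b), math.sin(b), math.cos(v), math.sin(v),
            math.cos(b) * math.sin(v)]

def likelihood(y, b, v, sigma=0.3):
    """Isotropic Gaussian observation likelihood (assumed noise level)."""
    sq = sum((yi - pi) ** 2 for yi, pi in zip(y, observe(b, v)))
    return math.exp(-sq / (2.0 * sigma ** 2))

def circular_mean(angles, weights):
    """Weighted mean of angles, computed on the unit circle."""
    s = sum(w * math.sin(a) for a, w in zip(angles, weights))
    c = sum(w * math.cos(a) for a, w in zip(angles, weights))
    return math.atan2(s, c) % TWO_PI

def angular_error(a, b):
    """Absolute difference between two angles, in [0, pi]."""
    return abs((a - b + math.pi) % TWO_PI - math.pi)

def track(observations, n_particles=500, seed=0):
    """Particle filter over (b, v); returns one posterior-mean estimate per frame."""
    rng = random.Random(seed)
    particles = [(rng.uniform(0.0, TWO_PI), rng.uniform(0.0, TWO_PI))
                 for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Constrained dynamics (assumed): the configuration advances steadily
        # along its manifold, while the view drifts slowly on the view circle.
        particles = [((b + 0.2 + rng.gauss(0.0, 0.1)) % TWO_PI,
                      (v + rng.gauss(0.0, 0.05)) % TWO_PI)
                     for b, v in particles]
        weights = [likelihood(y, b, v) for b, v in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimates.append((circular_mean([p[0] for p in particles], weights),
                          circular_mean([p[1] for p in particles], weights)))
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

def simulate(n_steps):
    """Noise-free synthetic sequence with a slowly changing view (toy data)."""
    truth = [((0.5 + 0.2 * t) % TWO_PI, (1.0 + 0.01 * t) % TWO_PI)
             for t in range(1, n_steps + 1)]
    return truth, [observe(b, v) for b, v in truth]

if __name__ == "__main__":
    truth, obs = simulate(40)
    est = track(obs)
    print("final configuration error:", angular_error(est[-1][0], truth[-1][0]))
    print("final view error:", angular_error(est[-1][1], truth[-1][1]))
```

The point of the sketch is only that the filter searches a two-dimensional state instead of a full set of joint angles plus camera parameters, which is the practical benefit of a low-dimensional state representation.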



Author information

Authors and Affiliations

Department of Electronic Engineering, School of Electronic Engineering, Communication Engineering and Computer Science, Yeungnam University, Gyeongsan, South Korea
C.-S. Lee
Department of Computer Science, Rutgers University, Piscataway, NJ, USA
A. Elgammal

Corresponding author

Correspondence to C.-S. Lee.

Electronic Supplementary Material

Three supplementary videos accompany the article (WMV; 802 KB, 3.24 MB, and 2.32 MB).

About this article

Cite this article

Lee, CS., Elgammal, A. Coupled Visual and Kinematic Manifold Models for Tracking. Int J Comput Vis 87, 118–139 (2010). https://doi.org/10.1007/s11263-009-0266-5

Received: 13 January 2008
Accepted: 29 June 2009
Published: 15 July 2009
Issue Date: March 2010
DOI: https://doi.org/10.1007/s11263-009-0266-5
