Determining a structured spatio-temporal representation of video content for efficient visualization and indexing

Gelgon, Marc; Bouthemy, Patrick

doi:10.1007/BFb0055692

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1406)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Efficient access to the information contained in video databases requires that a structured representation of the video content be built beforehand. This paper describes an approach in this direction, targeted at video indexing and browsing. Exploiting a 2D motion model estimator, we partition the video into shots, characterize camera motion, and extract and track mobile objects. These steps rely on robust motion estimation, statistical tests and contextual statistical labeling. The content of each shot can then be viewed on a synoptic frame composed of a mosaic image of the background scene, on which the trajectories of mobile objects are superimposed. The proposed method also provides instantaneous and long-term, qualitative and quantitative object motion cues for content-based indexing. Its different steps, and the system they form, are designed to keep computational cost low while coping with general video content. We provide experimental results on real-world sequences. The structured output opens up important possible extensions, for instance towards higher-level interpretation.
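To make the idea of robust 2D motion model estimation more concrete, the following is a minimal sketch and not the authors' estimator. It assumes a six-parameter affine model, u = a1 + a2·x + a3·y and v = a4 + a5·x + a6·y, fitted to given displacement vectors by iteratively reweighted least squares with Tukey biweights; the abstract specifies neither the model nor the robust scheme, and an estimator working directly on image intensities would differ. All function names and the tuning constant are illustrative choices.

```python
import math
import statistics


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(M[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (M[r][3] - s) / M[r][r]
    return x


def weighted_fit(points, values, weights):
    """Weighted least squares for value ~ p0 + p1*x + p2*y (normal equations)."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for (x, y), val, w in zip(points, values, weights):
        phi = (1.0, x, y)
        for i in range(3):
            b[i] += w * phi[i] * val
            for j in range(3):
                A[i][j] += w * phi[i] * phi[j]
    return solve3(A, b)


def affine_flow(params, point):
    """Displacement predicted at `point` by the assumed affine model."""
    a1, a2, a3, a4, a5, a6 = params
    x, y = point
    return (a1 + a2 * x + a3 * y, a4 + a5 * x + a6 * y)


def robust_affine_fit(points, flows, iterations=10, c=4.685):
    """IRLS with Tukey biweights; returns (params, final per-point weights).

    Points whose weight ends at 0 do not conform to the dominant motion,
    e.g. they may belong to a moving object rather than the background.
    """
    weights = [1.0] * len(points)
    params = []
    for _ in range(iterations):
        pu = weighted_fit(points, [f[0] for f in flows], weights)
        pv = weighted_fit(points, [f[1] for f in flows], weights)
        params = pu + pv
        residuals = []
        for p, (u, v) in zip(points, flows):
            mu, mv = affine_flow(params, p)
            residuals.append(math.hypot(u - mu, v - mv))
        # Robust scale from the median residual; the small constant avoids 0.
        scale = 1.4826 * statistics.median(residuals) + 1e-9
        limit = c * scale
        weights = [(1 - (r / limit) ** 2) ** 2 if r < limit else 0.0
                   for r in residuals]
    return params, weights
```

Called as `params, weights = robust_affine_fit(points, flows)` on a list of `(x, y)` positions and matching `(u, v)` displacements, the sketch returns the dominant-motion parameters together with a weight map. In this simplified setting the zero-weight points play the role of candidate mobile-object regions, while the fitted parameters describe the apparent camera motion.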

Author information

Authors and Affiliations

IRISA/INRIA, Campus universitaire de Beaulieu, 35042, Rennes cedex, France
Marc Gelgon & Patrick Bouthemy

Editor information

Hans Burkhardt, Bernd Neumann

Cite this paper

Gelgon, M., Bouthemy, P. (1998). Determining a structured spatio-temporal representation of video content for efficient visualization and indexing. In: Burkhardt, H., Neumann, B. (eds) Computer Vision — ECCV'98. ECCV 1998. Lecture Notes in Computer Science, vol 1406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0055692

DOI: https://doi.org/10.1007/BFb0055692
Published: 28 May 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64569-6
Online ISBN: 978-3-540-69354-3
eBook Packages: Springer Book Archive
