Finding Person X: Correlating Names with Visual Appearances

Yang, Jun; Chen, Ming-yu; Hauptmann, Alex

doi:10.1007/978-3-540-27814-6_34

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3115)

Included in the following conference series:

International Conference on Image and Video Retrieval

Abstract

People as news subjects carry rich semantics in broadcast news video, so finding a named person in the video is a major challenge for video retrieval. This task can be achieved by exploiting the multi-modal information in videos, including the transcript, video structure, and visual features. We propose a comprehensive approach for finding specific persons in broadcast news videos by exploring various clues such as names occurring in the transcript, face information, anchor scenes, and, most importantly, the timing pattern between names and people. Experiments on the TRECVID 2003 dataset show that our approach achieves high performance.
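The abstract names the clues but not how they are combined, so the following is only a minimal sketch of one way such clues could be fused, not the authors' method. Everything in it is an assumption made for illustration: the `Shot` fields, the Gaussian timing kernel and its `sigma`, the multiplicative `face_boost`, and the choice to suppress anchor shots entirely.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    start: float     # shot start time in seconds
    end: float       # shot end time in seconds
    has_face: bool   # hypothetical face-detector output
    is_anchor: bool  # hypothetical anchor-scene classifier output


def timing_weight(mention_time, shot, sigma=5.0):
    """Assumed timing model: Gaussian falloff with the gap (in seconds)
    between a name mention in the transcript and the shot boundaries."""
    if shot.start <= mention_time <= shot.end:
        gap = 0.0
    else:
        gap = min(abs(mention_time - shot.start), abs(mention_time - shot.end))
    return math.exp(-(gap ** 2) / (2 * sigma ** 2))


def score_shot(shot, mention_times, face_boost=2.0):
    """Combine the clues into one score (illustrative fusion rule only)."""
    if shot.is_anchor:
        # Assumption: the anchor often speaks the name without the
        # named person being on screen, so anchor shots are suppressed.
        return 0.0
    text_score = sum(timing_weight(t, shot) for t in mention_times)
    return text_score * (face_boost if shot.has_face else 1.0)


def rank_shots(shots, mention_times):
    """Return shots ordered from most to least likely to show the person."""
    return sorted(shots, key=lambda s: score_shot(s, mention_times), reverse=True)
```

With a name mentioned at 8 s inside an anchor shot, this sketch ranks the following non-anchor shot that contains a face highest, which mirrors the intuition that a person tends to appear shortly after being named.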

Author information

Authors and Affiliations

School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
Jun Yang, Ming-yu Chen & Alex Hauptmann

Editor information

Editors and Affiliations

School of Computing, Mathematical and Information Sciences, University of Brighton, UK
Peter Enser
Informatics and Telematics Institute, Centre for Research and Technology-Hellas, 57001, Thessaloniki, Greece
Yiannis Kompatsiaris
Centre for Digital Video Processing, Adaptive Information Cluster, Dublin City University, Ireland
Noel E. O’Connor
Dublin City University, Dublin, Ireland
Alan F. Smeaton
ISLA lab, Informatics Institute, University of Amsterdam, The Netherlands
Arnold W. M. Smeulders

Cite this paper

Yang, J., Chen, My., Hauptmann, A. (2004). Finding Person X: Correlating Names with Visual Appearances. In: Enser, P., Kompatsiaris, Y., O’Connor, N.E., Smeaton, A.F., Smeulders, A.W.M. (eds) Image and Video Retrieval. CIVR 2004. Lecture Notes in Computer Science, vol 3115. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27814-6_34

DOI: https://doi.org/10.1007/978-3-540-27814-6_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22539-3
Online ISBN: 978-3-540-27814-6
eBook Packages: Springer Book Archive
