Overview of VideoCLEF 2008: Automatic Generation of Topic-Based Feeds for Dual Language Audio-Visual Content

Larson, Martha; Newman, Eamonn; Jones, Gareth J. F.

doi:10.1007/978-3-642-04447-2_119


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5706)

Included in the following conference series:

Workshop of the Cross-Language Evaluation Forum for European Languages


Abstract

The VideoCLEF track, introduced in 2008, aims to develop and evaluate tasks related to analysis of and access to multilingual multimedia content. In its first year, VideoCLEF piloted the Vid2RSS task, whose main subtask was the classification of dual language video (Dutch-language television content featuring English-speaking experts and studio guests). The task offered two additional discretionary subtasks: feed translation and automatic keyframe extraction. Task participants were supplied with Dutch archival metadata, Dutch speech transcripts, English speech transcripts and ten thematic category labels, which they were required to assign to the test set videos. The videos were grouped by class label into topic-based RSS-feeds, displaying title, description and keyframe for each video.
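The grouping step described above can be illustrated with a minimal sketch. It assumes each classified video is a dictionary with hypothetical keys `title`, `description`, `keyframe_url` and `labels`, and that a video may carry more than one label; none of these names come from the track itself. The output is a simplified RSS 2.0-style document, not the feed format actually used in the evaluation.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def build_feeds(videos):
    """Group classified videos by label and render one RSS-style feed per label."""
    by_label = defaultdict(list)
    for video in videos:
        # Assumption of this sketch: a video may be assigned several labels.
        for label in video["labels"]:
            by_label[label].append(video)

    feeds = {}
    for label, items in by_label.items():
        rss = ET.Element("rss", version="2.0")
        channel = ET.SubElement(rss, "channel")
        ET.SubElement(channel, "title").text = label
        # A fully valid RSS 2.0 channel would also need <link> and <description>,
        # and each enclosure would need a length attribute; omitted for brevity.
        for video in items:
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = video["title"]
            ET.SubElement(item, "description").text = video["description"]
            ET.SubElement(item, "enclosure", url=video["keyframe_url"], type="image/jpeg")
        feeds[label] = ET.tostring(rss, encoding="unicode")
    return feeds
```

Calling `build_feeds` on a list of such dictionaries returns a mapping from each label to its feed as an XML string, so one feed per thematic category can be written out or served directly.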

Five groups participated in the 2008 VideoCLEF track. Participants were required to collect their own training data; both Wikipedia and general web content were used. Groups deployed various classifiers (SVM, Naive Bayes and k-NN) or treated the problem as an information retrieval task. Both the Dutch speech transcripts and the archival metadata performed well as sources of indexing features, but no group succeeded in exploiting combinations of feature sources to significantly enhance performance. A small-scale fluency/adequacy evaluation of the translation task output revealed the translation to be of sufficient quality to make it valuable to a non-Dutch-speaking English speaker. For keyframe extraction, the strategy chosen was to select the keyframe from the shot with the most representative speech transcript content. In a small user study, the automatically selected shots were shown to be competitive with manually selected shots. Future years of VideoCLEF will aim to expand the corpus and the class label list, as well as to extend the track to additional tasks.
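The keyframe strategy mentioned above can be sketched as follows. The abstract does not say how "most representative" was measured, so this example assumes one plausible reading: the shot whose transcript has the highest cosine similarity to the transcript of the whole video, using plain term counts. The function names are illustrative and not taken from any participant's system.

```python
import math
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words term counts for a transcript snippet."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def most_representative_shot(shot_transcripts):
    """Return the index of the shot whose transcript best matches the whole video."""
    # Assumption: the video-level transcript is the concatenation of all shots.
    video_vector = term_vector(" ".join(shot_transcripts))
    scores = [cosine(term_vector(t), video_vector) for t in shot_transcripts]
    return max(range(len(scores)), key=scores.__getitem__)
```

The keyframe would then be taken from the shot at the returned index. A weighting scheme such as TF-IDF could replace the raw counts without changing the overall structure.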



Author information

Authors and Affiliations

EEMCS, Delft University of Technology, 2628 CD Delft, Netherlands
Martha Larson
Centre for Digital Video Processing, Dublin City University, Dublin 9, Ireland
Eamonn Newman & Gareth J. F. Jones


Editor information

Editors and Affiliations

Istituto di Scienza e Tecnologie dell’Informazione, CNR, Pisa, Italy
Carol Peters
RWTH Aachen University, Aachen, Germany
Thomas Deselaers
University of Padua, Padua, Italy
Nicola Ferro
LSI-UNED, Madrid, Spain
Julio Gonzalo & Anselmo Peñas
Dublin City University, Dublin 9, Ireland
Gareth J. F. Jones
Helsinki University of Technology, Espoo, Finland
Mikko Kurimo
University of Hildesheim, Hildesheim, Germany
Thomas Mandl
Humboldt University Berlin, Germany
Vivien Petras


About this paper

Cite this paper

Larson, M., Newman, E., Jones, G.J.F. (2009). Overview of VideoCLEF 2008: Automatic Generation of Topic-Based Feeds for Dual Language Audio-Visual Content. In: Peters, C., et al. Evaluating Systems for Multilingual and Multimodal Information Access. CLEF 2008. Lecture Notes in Computer Science, vol 5706. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04447-2_119


DOI: https://doi.org/10.1007/978-3-642-04447-2_119
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04446-5
Online ISBN: 978-3-642-04447-2
eBook Packages: Computer Science, Computer Science (R0)
