Relevance Assessment for Visual Video Re-ranking

Aldana-Iuit, Javier; Chum, Ondřej; Matas, Jiři

doi:10.1007/978-3-319-11758-4_46

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8814)

Included in the following conference series:

International Conference Image Analysis and Recognition

Abstract

The following problem is considered: given a name or phrase specifying an object, collect images and videos from the internet that possibly depict the object, using a textual query on their name or annotation. A visual model is built from the images and used to rank the videos by relevance to the object of interest. Shot relevance is defined as the duration for which the object of interest is visible. The model is based on local image features, and relevant shot detection builds on wide baseline stereo matching. The method is tested on 10 text phrases corresponding to 10 landmarks. The pool of 100 videos, collected by querying YouTube, includes seven relevant videos for each landmark. The implementation runs faster than real time, at 208 frames per second. Averaged over the set of landmarks, the method achieves a mean precision of 0.65 at recall 0.95 and a mean Average Precision (mAP) of 0.92.
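
The ranking and evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that each video has already been assigned a relevance score equal to the number of seconds the object is visible, and that ground-truth relevance labels are available. The function names, the toy durations and the labels are hypothetical and are not taken from the paper, whose exact evaluation protocol is not given in the abstract.

```python
from typing import List, Sequence


def rank_by_relevance(durations: Sequence[float]) -> List[int]:
    """Return video indices sorted by decreasing visible duration (seconds)."""
    return sorted(range(len(durations)), key=lambda i: durations[i], reverse=True)


def average_precision(ranked_labels: Sequence[bool]) -> float:
    """Mean of the precision values measured at the rank of each relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def precision_at_recall(ranked_labels: Sequence[bool], recall_level: float) -> float:
    """Precision at the first rank where recall reaches recall_level."""
    total = sum(ranked_labels)
    if total == 0:
        return 0.0
    hits = 0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            if hits / total >= recall_level:
                return hits / rank
    return 0.0


# Toy data (hypothetical): visible duration per video and ground-truth labels.
durations = [12.0, 0.0, 9.0, 0.5, 8.0]
labels = [True, False, False, False, True]

order = rank_by_relevance(durations)
ranked_labels = [labels[i] for i in order]
ap = average_precision(ranked_labels)
p95 = precision_at_recall(ranked_labels, 0.95)
```

In this sketch the mAP would be obtained by averaging `average_precision` over the per-landmark rankings, and the mean precision at recall 0.95 by averaging `precision_at_recall` in the same way.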

Author information

Authors and Affiliations

Department of Cybernetics, Faculty of Electrical Engineering, Center for Machine Perception, Czech Technical University in Prague, Karlovo nam. 13, 121 35, Prague 2, Czech Republic
Javier Aldana-Iuit, Ondřej Chum & Jiři Matas

Corresponding author

Correspondence to Javier Aldana-Iuit.

Editor information

Editors and Affiliations

Faculty of Engineering, University of Porto, Porto, Portugal
Aurélio Campilho
Dept. of Electrical and Computer Eng., University of Waterloo, Waterloo, Ontario, Canada
Mohamed Kamel

About this paper

Cite this paper

Aldana-Iuit, J., Chum, O., Matas, J. (2014). Relevance Assessment for Visual Video Re-ranking. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2014. Lecture Notes in Computer Science, vol 8814. Springer, Cham. https://doi.org/10.1007/978-3-319-11758-4_46

DOI: https://doi.org/10.1007/978-3-319-11758-4_46
Published: 10 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11757-7
Online ISBN: 978-3-319-11758-4
eBook Packages: Computer Science, Computer Science (R0)
