Fusing Visual and Textual Retrieval Techniques to Effectively Search Large Collections of Wikipedia Images

Lau, C.; Tjondronegoro, D.; Zhang, J.; Geva, S.; Liu, Y.

doi:10.1007/978-3-540-73888-6_34

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4518)

Included in the following conference series:

International Workshop of the Initiative for the Evaluation of XML Retrieval

Abstract

This paper presents an experimental study of combination techniques for content-based image retrieval (CBIR) that fuse visual and textual search results. The evaluation is benchmarked on more than 160,000 samples from the INEX-MM2006 image dataset and the corresponding XML documents. For visual search, we combined three features: Hough transform, object color histogram, and texture (H.O.T). For comparison, we used the provided UvA features. Among our submissions, the UvA+Text combination performed most effectively, closely followed by our H.O.T (visual-only) feature. Moreover, H.O.T+Text still outperformed UvA (visual) alone. These findings show that combining effective text and visual search results can improve the overall performance of CBIR on Wikipedia collections, which contain a heterogeneous (i.e. wide) range of genres and topics.
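The abstract does not say which combination techniques were evaluated, so the following is only a minimal sketch of one common way to fuse two ranked result lists: min-max normalization of each list followed by a weighted linear combination. The function names, the `text_weight` parameter, and the image identifiers are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw scores to [0, 1] so that two engines become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(
    text_scores: Dict[str, float],
    visual_scores: Dict[str, float],
    text_weight: float = 0.5,  # assumed parameter, not a value from the paper
) -> List[Tuple[str, float]]:
    """Weighted linear fusion of text and visual scores (illustrative only).

    A document missing from one result list contributes 0 for that modality.
    Results are sorted by fused score, ties broken by document id.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    text_norm = min_max_normalize(text_scores)
    visual_norm = min_max_normalize(visual_scores)
    fused = {
        doc: text_weight * text_norm.get(doc, 0.0)
        + (1.0 - text_weight) * visual_norm.get(doc, 0.0)
        for doc in set(text_norm) | set(visual_norm)
    }
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

Calling `fuse(text_scores, visual_scores, text_weight=0.7)` would favor the textual engine. Setting the weight to 1.0 or 0.0 reduces the sketch to a text-only or visual-only ranking, which gives a baseline to compare a fused run against.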

Author information

Authors and Affiliations

Faculty of Information Technology, Queensland University of Technology, 2 George Street, GPO Box 2434, Brisbane, QLD 4001, Australia
C. Lau, D. Tjondronegoro, J. Zhang, S. Geva & Y. Liu

Editor information

Norbert Fuhr, Mounia Lalmas, Andrew Trotman

About this paper

Cite this paper

Lau, C., Tjondronegoro, D., Zhang, J., Geva, S., Liu, Y. (2007). Fusing Visual and Textual Retrieval Techniques to Effectively Search Large Collections of Wikipedia Images. In: Fuhr, N., Lalmas, M., Trotman, A. (eds) Comparative Evaluation of XML Information Retrieval Systems. INEX 2006. Lecture Notes in Computer Science, vol 4518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73888-6_34

DOI: https://doi.org/10.1007/978-3-540-73888-6_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73887-9
Online ISBN: 978-3-540-73888-6
eBook Packages: Computer Science, Computer Science (R0)
