Abstract
Recently, weighted k-nearest-neighbor label prediction combined with distance metric learning (KNN+ML) [10,14,17] has attracted growing attention and shown promising results on the image annotation task. In the KNN+ML framework, a single uniform distance metric is usually learned from a collection of similar/dissimilar image pairs in the training data, so the distance between any pair of images is globally unique. However, this may not be sufficient for label prediction in annotation, because a single distance cannot distinguish among the multiple labels attached to each image. In this paper, we learn multiple label-specific distance metrics and measure the distance of an image pair under each label's metric. We also propose a novel label-specific prediction model, in which the weight of each label is determined by its label-specific distance rather than by the global distance used previously. Compared with previous KNN+ML methods, the proposed method can discriminate each label in each neighbor and effectively reduces the prediction of false positive and false negative labels. Extensive experimental results on three benchmark datasets demonstrate that the proposed method achieves more accurate annotation results and competitive overall performance.
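The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a diagonal metric per label and exponential neighbor weights; the function names `label_distance` and `predict_label_scores`, the parameter `k`, and the toy data are all hypothetical and chosen for illustration.

```python
import math


def label_distance(x, y, w):
    """Squared distance under a diagonal metric w (assumed form, one weight per feature)."""
    return sum(wk * (xk - yk) ** 2 for wk, xk, yk in zip(w, x, y))


def predict_label_scores(query, train_x, train_y, metrics, k=2):
    """Score each label with a weighted kNN vote that uses that label's own metric.

    train_y[j][l] is 1 if training image j carries label l, else 0.
    metrics[l] is the (hypothetical) diagonal metric for label l.
    """
    scores = []
    for l, w in enumerate(metrics):
        # Rank training images by distance under label l's metric only.
        nearest = sorted(
            (label_distance(query, x, w), j) for j, x in enumerate(train_x)
        )[:k]
        # Exponential weights (an assumption): closer neighbors contribute more.
        weights = [math.exp(-d) for d, _ in nearest]
        total = sum(weights)
        vote = sum(wt * train_y[j][l] for wt, (_, j) in zip(weights, nearest))
        scores.append(vote / total)
    return scores


# Toy data: three images with two features and two labels.
train_x = [[0.0, 0.0], [0.0, 5.0], [5.0, 0.0]]
train_y = [[1, 0], [1, 1], [0, 0]]
# Label 0 is assumed to depend only on feature 0, label 1 only on feature 1.
metrics = [[1.0, 0.0], [0.0, 1.0]]

scores = predict_label_scores([0.5, 0.5], train_x, train_y, metrics, k=2)
print(scores)  # [1.0, 0.0]
```

In this toy case the second training image is a near neighbor of the query under label 0's metric but a distant one under label 1's metric, so it supports label 0 without contributing to label 1. A single global metric would assign that image one distance for both labels.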
This work has been partly supported by Grant-in-Aid for Scientific Research (B), Grant Number 24300074.
References
1. von Ahn, L., Dabbish, L.: Labeling images with a computer game. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2004)
2. Blei, D.M., Jordan, M.I.: Modeling annotated data. In: ACM SIGIR 2003 (2003)
3. Carneiro, G., Chan, A., Moreno, P., Vasconcelos, N.: Supervised learning of semantic classes for image annotation and retrieval. IEEE Transactions on PAMI (2007)
4. Dai, L., Wang, X.J., Zhang, L., Yu, N.: Efficient tag mining via mixture modeling for real-time search-based image annotation. In: ICME (2012)
5. Putthividhya, D., Attias, H.T., Nagarajan, S.S.: Topic regression multi-modal latent Dirichlet allocation for image annotation. In: CVPR (2010)
6. Duygulu, P., Barnard, K., de Freitas, J.F.G., Forsyth, D.: Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002, Part IV. LNCS, vol. 2353, pp. 97–112. Springer, Heidelberg (2002)
7. Frome, A., Singer, Y., Sha, F., Malik, J.: Learning globally-consistent local distance functions for shape-based image retrieval and classification. In: ICCV (2007)
8. Fu, H., Zhang, Q., Qiu, G.: Random forest for image annotation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 86–99. Springer, Heidelberg (2012)
9. Grubinger, M.: Analysis and Evaluation of Visual Information Systems Performance. Ph.D. thesis, Victoria University (2007)
10. Guillaumin, M., Mensink, T., Verbeek, J., Schmid, C.: TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation. In: ICCV (2009)
11. Guillaumin, M., Verbeek, J., Schmid, C.: Is that you? Metric learning approaches for face identification. In: ICCV (2009)
12. Huang, S.J., Zhou, Z.H.: Multi-label learning by exploiting label correlations locally. In: AAAI 2012 (2012)
13. Kostinger, M., Hirzer, M., Wohlhart, P., Roth, P., Bischof, H.: Large scale metric learning from equivalence constraints. In: CVPR (2012)
14. Makadia, A., Pavlovic, V., Kumar, S.: A new baseline for image annotation. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part III. LNCS, vol. 5304, pp. 316–329. Springer, Heidelberg (2008)
15. Mensink, T., Verbeek, J., Perronnin, F., Csurka, G.: Metric learning for large scale image classification: Generalizing to new classes at near-zero cost. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 488–501. Springer, Heidelberg (2012)
16. Torralba, A., Fergus, R., Freeman, W.: 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Transactions on PAMI (2008)
17. Verma, Y., Jawahar, C.V.: Image annotation using metric learning in semantic neighbourhoods. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part III. LNCS, vol. 7574, pp. 836–849. Springer, Heidelberg (2012)
18. Wang, X.J., Zhang, L., Liu, M., Li, Y., Ma, W.Y.: Arista - image search to annotation on billions of web photos. In: CVPR (2010)
19. Weinberger, K.Q., Blitzer, J., Saul, L.K.: Distance metric learning for large margin nearest neighbor classification. In: NIPS (2006)
20. Weinberger, K., Saul, L.: Fast solvers and efficient implementations for distance metric learning. In: ICML (2008)
21. Wu, P., Hoi, S.C.H., Zhao, P., He, Y.: Mining social images with distance metric learning for automated image tagging. In: WSDM (2011)
22. Xiang, Y., Zhou, X., Chua, T.S., Ngo, C.W.: A revisit of generative model for automatic image annotation using Markov random fields. In: CVPR (2009)
23. Zhang, S., Huang, J., Huang, Y., Yu, Y., Li, H., Metaxas, D.: Automatic image annotation using group sparsity. In: CVPR (2010)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xu, X., Shimada, A., Taniguchi, Ri. (2013). Image Annotation by Learning Label-Specific Distance Metrics. In: Petrosino, A. (eds) Image Analysis and Processing – ICIAP 2013. ICIAP 2013. Lecture Notes in Computer Science, vol 8156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41181-6_11
DOI: https://doi.org/10.1007/978-3-642-41181-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41180-9
Online ISBN: 978-3-642-41181-6
eBook Packages: Computer Science, Computer Science (R0)