Abstract
In this paper, we present a novel method for image annotation and make three contributions. Firstly, we propose to use the tags contained in the training images as supervising information to guide the generation of random trees, so that the retrieved nearest-neighbor images are not only visually alike but also semantically related. Secondly, unlike conventional decision tree methods, which fuse the information contained at each leaf node individually, our method treats the random forest as a whole and introduces the new concepts of semantic nearest neighbors (SNN) and semantic similarity measure (SSM). Thirdly, we annotate an image from the tags of its SNN based on SSM, and we have developed a novel learning-to-rank algorithm to systematically assign the optimal tags to the image. The new technique is intrinsically scalable, and we present experimental results to demonstrate that it is competitive with state-of-the-art methods.
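To make the idea of treating the forest as a whole more concrete, the following is a minimal sketch in Python. It is not the paper's algorithm: the abstract does not define SSM, so the sketch assumes the common random-forest proximity (the fraction of trees in which two images fall into the same leaf) as a stand-in similarity, and it replaces the learning-to-rank step with a simple similarity-weighted tag vote. The function names `forest_proximity` and `annotate`, the parameters `k` and `n_tags`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def forest_proximity(query_leaves, train_leaves):
    """Assumed stand-in for SSM: the fraction of trees in which the query
    lands in the same leaf as each training image."""
    n_trees = len(query_leaves)
    return [
        sum(q == t for q, t in zip(query_leaves, leaves)) / n_trees
        for leaves in train_leaves
    ]


def annotate(query_leaves, train_leaves, train_tags, k=2, n_tags=2):
    """Pick the k most similar training images and rank their tags by a
    similarity-weighted vote (a simplification, not learning to rank)."""
    sims = forest_proximity(query_leaves, train_leaves)
    neighbors = sorted(range(len(sims)), key=lambda i: -sims[i])[:k]
    scores = defaultdict(float)
    for i in neighbors:
        for tag in train_tags[i]:
            scores[tag] += sims[i]
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:n_tags]


# Toy data: leaf index of each training image in each of 3 trees.
train_leaves = [
    [0, 2, 1],
    [0, 2, 3],
    [1, 0, 1],
    [4, 4, 4],
]
train_tags = [["sky", "sea"], ["sky", "beach"], ["car"], ["tree"]]
query_leaves = [0, 2, 1]

print(annotate(query_leaves, train_leaves, train_tags))
```

In this sketch the similarity aggregates evidence across all trees rather than reading tags from a single leaf, which is the contrast the abstract draws with conventional decision tree methods.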
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fu, H., Zhang, Q., Qiu, G. (2012). Random Forest for Image Annotation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33783-3_7
DOI: https://doi.org/10.1007/978-3-642-33783-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33782-6
Online ISBN: 978-3-642-33783-3
eBook Packages: Computer Science (R0)