Abstract
Vector quantization (VQ) using exhaustive nearest neighbor (NN) search is the speed bottleneck in classic bag of visual words (BOV) models. Approximate NN (ANN) search methods still cost great time in VQ, since they check multiple regions in the search space to reduce VQ errors. In this paper, we propose ExVQ, an exclusive NN search method to speed up BOV models. Given a visual descriptor, a portion of search regions is excluded from the whole search space by a linear projection. We ensure that minimal VQ errors are introduced in the exclusion by learning an accurate classifier. Multiple exclusions are organized in a tree structure in ExVQ, whose VQ speed and VQ error rate can be reliably estimated. We show that ExVQ is much faster than state-of-the-art ANN methods in BOV models while maintaining almost the same classification accuracy. In addition, we empirically show that even with the VQ error rate as high as 30%, the classification accuracy of some ANN methods, including ExVQ, is similar to that of exhaustive search (which has zero VQ error). In some cases, ExVQ has even higher classification accuracy than the exhaustive search.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Sivic, J., Zisserman, A.: Video google: A text retrieval approach to object matching in videos. In: ICCV, pp. 1470–1477 (2003)
Fei-Fei, L., Perona, P.: A bayesian hierarchical model for learning natural scene categories. In: CVPR, vol. 2, pp. 524–531 (2005)
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR, pp. 2169–2178 (2006)
Wang, J., Yang, J., Yu, K., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: CVPR, pp. 3360–3367 (2010)
Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR, pp. 1794–1801 (2009)
van de Sande, K.E.A., Gevers, T., Snoek, C.G.M.: Empowering visual categorization with the GPU. IEEE Transaction on Multimedia 13, 60–70 (2011)
Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training example: an incremental bayesian approach tested on 101 object categories. In: CVPR (2004)
Muja, M., Lowe, D.G.: Fast approximate nearest neighbors with automatic algorithm configuration. In: VISAPP, vol. 1, pp. 331–340 (2009)
Bhattacharya, S., Sukthankar, R., Jin, R., Shah, M.: A probabilistic representation for efficient large scale visual tasks. In: CVPR, vol. 2, pp. 2593–2600 (2011)
Gersho, A., Gray, R.M.: Vector quantization and signal compression. Springer (1991)
Nistér, D., Stewénius, H.: Scalable recognition with a vocabulary tree. In: CVPR, vol. 2, pp. 2161–2168 (2006)
Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Lost in quantization: Improving particular object retrieval in large scale image databases. In: CVPR (2008)
Lepetit, V., Fua, P.: Keypoint recognition using randomized trees. IEEE TPAMI 28, 1465–1479 (2006)
Moosmann, F., Nowak, E., Jurie, F.: Randomized clustering forests for image classification. IEEE TPAMI 30, 1632–1646 (2008)
Liu, T., Moore, A.W., Gray, A., Yang, K.: An investigation of practical approximate nearest neighbor algorithms. In: NIPS (2004)
Marszałek, M., Schmid, C.: Constructing Category Hierarchies for Visual Recognition. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part IV. LNCS, vol. 5305, pp. 479–491. Springer, Heidelberg (2008)
Gao, T., Koller, D.: Discriminative learning of relaxed hierarchy for large-scale visual recognition. In: ICCV (2011)
Datar, M., Immorlica, N., Indyk, P., Mirrokni, V.S.: Locality-sensitive hashing scheme based on p-stable distributions. In: Symposium on Computational Geometry, pp. 253–262 (2004)
Jegou, H., Douze, M., Schmid, C.: Improving bag-of-features for large scale image search. IJCV 87, 316–336 (2010)
Weiss, Y., Torralba, A., Fergus, R.: Spectral hashing. In: NIPS, vol. 21, pp. 1753–1760 (2009)
Jegou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor search. IEEE TPAMI 33, 117–128 (2011)
Li, L.J., Fei-Fei, L.: What, where and who? Classifying events by scene and object recognition. In: ICCV, pp. 261–268 (2007)
Quattoni, A., Torralba, A.: Recognizing indoor scenes. In: CVPR, pp. 413–420 (2009)
Wu, J., Rehg, J.M.: CENTRIST: A visual descriptor for scene categorization. IEEE TPAMI 33, 1489–1501 (2011)
Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: A library for large linear classification. JMLR 9, 1871–1874 (2008)
Vedaldi, A., Zisserman, A.: Efficient additive kernels via explicit feature maps. IEEE TPAMI 34, 480–492 (2012)
Wu, J.: Power mean SVM for large scale visual classification. In: CVPR, pp. 2344–2351 (2012)
Wu, J., Rehg, J.M.: Beyond the Euclidean distance: Creating effective visual codebooks using the histogram intersection kernel. In: ICCV, pp. 630–637 (2009)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, Y., Wu, J., Lin, W. (2013). Exclusive Visual Descriptor Quantization. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37331-2_31
Download citation
DOI: https://doi.org/10.1007/978-3-642-37331-2_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37330-5
Online ISBN: 978-3-642-37331-2
eBook Packages: Computer ScienceComputer Science (R0)