Descriptor Learning for Efficient Retrieval

Philbin, James; Isard, Michael; Sivic, Josef; Zisserman, Andrew

doi:10.1007/978-3-642-15558-1_49

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6313)

Included in the following conference series: European Conference on Computer Vision

Abstract

Many visual search and matching systems represent images using sparse sets of “visual words”: descriptors that have been quantized by assignment to the best-matching symbol in a discrete vocabulary. Errors in this quantization procedure propagate throughout the rest of the system, either harming performance or requiring correction using additional storage or processing. This paper aims to reduce these quantization errors at source, by learning a projection from descriptor space to a new Euclidean space in which standard clustering techniques are more likely to assign matching descriptors to the same cluster, and non-matching descriptors to different clusters.
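
To make the quantization error concrete, the following is a minimal sketch and not the paper's code. It assigns each descriptor to its nearest symbol in a tiny vocabulary and shows two nearby descriptors landing on different visual words. The two-word vocabulary, the 2-D descriptor values, and the function names `squared_distance` and `quantize` are illustrative assumptions; real systems use high-dimensional descriptors and much larger vocabularies.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(descriptor: Sequence[float],
             vocabulary: Sequence[Sequence[float]]) -> int:
    """Return the index of the best-matching (nearest) visual word."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


# Hypothetical vocabulary of two visual words (cluster centres).
vocabulary = [(0.0, 0.0), (1.0, 0.0)]

# Two descriptors that are close to each other but straddle the
# boundary between the two cluster cells.
a = (0.45, 0.0)
b = (0.55, 0.0)

word_a = quantize(a, vocabulary)
word_b = quantize(b, vocabulary)

# The descriptors get different visual words even though they nearly coincide.
print(word_a, word_b)
```

Under these assumed values, the two descriptors lie on opposite sides of the cell boundary, so a system that compares only visual-word indices would treat them as non-matching.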

To achieve this, we learn a non-linear transformation model by minimizing a novel margin-based cost function, which aims to separate matching descriptors from two classes of non-matching descriptors. Training data is generated automatically by leveraging geometric consistency. Scalable, stochastic gradient methods are used for the optimization.
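
As a rough illustration of what a margin-based cost over descriptor pairs can look like, here is a sketch under stated assumptions, not the cost function from the paper. The abstract does not give the formula or define the two classes of non-matching descriptors, so this version uses a single non-matching class, a linear projection `W` in place of the non-linear model, and hypothetical `threshold` and `margin` parameters.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Pair = Tuple[Vector, Vector, bool]  # (descriptor, descriptor, is_match)


def project(W: Sequence[Vector], x: Vector) -> List[float]:
    """Apply a linear projection W (a list of rows) to descriptor x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def distance(u: Vector, v: Vector) -> float:
    """Euclidean distance in the projected space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def margin_cost(W: Sequence[Vector], pairs: Sequence[Pair],
                threshold: float = 1.0, margin: float = 0.5) -> float:
    """Hinge-style cost (assumed form): matching pairs should lie closer
    than threshold - margin, non-matching pairs farther than
    threshold + margin."""
    total = 0.0
    for x, y, is_match in pairs:
        d = distance(project(W, x), project(W, y))
        if is_match:
            total += max(0.0, d - (threshold - margin))
        else:
            total += max(0.0, (threshold + margin) - d)
    return total


# Toy data: the second coordinate is nuisance variation for the matching
# pair; the first coordinate separates the non-matching pair.
pairs = [
    ((0.0, 0.0), (0.2, 5.0), True),
    ((0.0, 0.0), (3.0, 0.0), False),
]

W_good = [[1.0, 0.0]]  # keeps the informative coordinate
W_bad = [[0.0, 1.0]]   # keeps the nuisance coordinate

print(margin_cost(W_good, pairs), margin_cost(W_bad, pairs))
```

In this toy setting the projection that discards the nuisance coordinate incurs zero cost, while the other is penalized on both pairs. A stochastic gradient method would adjust the projection parameters to reduce such a cost one sampled pair or mini-batch at a time.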

For the case of particular object retrieval, we demonstrate impressive gains in performance on a ground truth dataset: our learnt 32-D descriptor without spatial re-ranking outperforms a baseline method using 128-D SIFT descriptors with spatial re-ranking.

Author information

Authors and Affiliations

Visual Geometry Group, Department of Engineering Science, University of Oxford
James Philbin & Andrew Zisserman
WILLOW, Laboratoire d’Informatique de l’Ecole Normale Superieure, INRIA, Paris
Josef Sivic
Microsoft Research, Silicon Valley
Michael Isard

Editor information

Editors and Affiliations

GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Kostas Daniilidis
School of Electrical and Computer Engineering, National Technical University of Athens, 15773, Athens, Greece
Petros Maragos
Department of Applied Mathematics, Ecole Centrale de Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Nikos Paragios

About this paper

Cite this paper

Philbin, J., Isard, M., Sivic, J., Zisserman, A. (2010). Descriptor Learning for Efficient Retrieval. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6313. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15558-1_49

DOI: https://doi.org/10.1007/978-3-642-15558-1_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15557-4
Online ISBN: 978-3-642-15558-1
eBook Packages: Computer Science, Computer Science (R0)
