Visual Recognition with Humans in the Loop

Branson, Steve; Wah, Catherine; Schroff, Florian; Babenko, Boris; Welinder, Peter; Perona, Pietro; Belongie, Serge

doi:10.1007/978-3-642-15561-1_32

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6314)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We present an interactive, hybrid human-computer method for object classification. The method applies to classes of objects that are recognizable by people with appropriate expertise (e.g., animal species or airplane model), but not (in general) by people without such expertise. It can be seen as a visual version of the 20 questions game, where questions based on simple visual attributes are posed interactively. The goal is to identify the true class while minimizing the number of questions asked, using the visual content of the image. We introduce a general framework for incorporating almost any off-the-shelf multi-class object recognition algorithm into the visual 20 questions game, and provide methodologies to account for imperfect user responses and unreliable computer vision algorithms. We evaluate our methods on Birds-200, a difficult dataset of 200 tightly-related bird species, and on the Animals With Attributes dataset. Our results demonstrate that incorporating user input drives up recognition accuracy to levels that are good enough for practical applications, while at the same time, computer vision reduces the amount of human interaction required.
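
The question-asking loop described above can be illustrated with a small sketch. The abstract does not state how the next question is chosen, so the code below assumes a common criterion: pick the yes/no attribute question with the highest expected information gain under the current class posterior, then update that posterior with Bayes' rule. The class list, the attribute table, the `noise` parameter (the probability that a user answers incorrectly), and all function names are hypothetical and chosen only for illustration; the starting posterior stands in for the output of any multi-class recognition algorithm.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def answer_likelihood(attr_column, answer, noise):
    """P(answer | class) for a yes/no question.

    Assumption: a user gives the wrong answer with probability `noise`,
    independently of the class.
    """
    return [(1 - noise) if a == answer else noise for a in attr_column]

def update(prior, likelihood):
    """Bayes update of the class posterior given one answer's likelihood."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    z = sum(joint)
    return [j / z for j in joint]

def expected_info_gain(prior, attr_column, noise):
    """Expected drop in class entropy from asking one question."""
    h0 = entropy(prior)
    gain = 0.0
    for answer in (True, False):
        lik = answer_likelihood(attr_column, answer, noise)
        p_answer = sum(p * l for p, l in zip(prior, lik))
        if p_answer > 0:
            gain += p_answer * (h0 - entropy(update(prior, lik)))
    return gain

def pick_question(prior, attributes, asked=(), noise=0.1):
    """Return the not-yet-asked question with the highest expected gain."""
    candidates = [q for q in attributes if q not in asked]
    return max(candidates,
               key=lambda q: expected_info_gain(prior, attributes[q], noise))

# Hypothetical toy data: four classes and three binary attributes.
CLASSES = ["cardinal", "blue jay", "crow", "raven"]
ATTRIBUTES = {
    "has_red_plumage":       [True,  False, False, False],
    "has_crest":             [True,  True,  False, False],
    "has_wedge_shaped_tail": [False, False, False, True],
}

# Without computer vision, every class is equally likely.
uniform_prior = [0.25, 0.25, 0.25, 0.25]
# Hypothetical classifier output that already narrows it to crow vs. raven.
vision_prior = [0.02, 0.02, 0.48, 0.48]

first_q_blind = pick_question(uniform_prior, ATTRIBUTES)
first_q_vision = pick_question(vision_prior, ATTRIBUTES)

# Simulate a "yes" answer to the vision-guided question and update.
posterior = update(
    vision_prior,
    answer_likelihood(ATTRIBUTES[first_q_vision], True, noise=0.1),
)
best_class = CLASSES[posterior.index(max(posterior))]
```

In this toy setup the uniform prior leads to the question that splits the classes evenly, while the vision-informed prior skips straight to the question that separates the two remaining candidates. That is one way to picture how a classifier's output could reduce the number of questions a user has to answer.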

Author information

Authors and Affiliations

University of California, San Diego
Steve Branson, Catherine Wah, Florian Schroff, Boris Babenko & Serge Belongie
California Institute of Technology
Peter Welinder & Pietro Perona

Editor information

Editors and Affiliations

GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Kostas Daniilidis
School of Electrical and Computer Engineering, National Technical University of Athens, 15773, Athens, Greece
Petros Maragos
Department of Applied Mathematics, Ecole Centrale de Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Nikos Paragios

Cite this paper

Branson, S. et al. (2010). Visual Recognition with Humans in the Loop. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6314. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15561-1_32

DOI: https://doi.org/10.1007/978-3-642-15561-1_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15560-4
Online ISBN: 978-3-642-15561-1
eBook Packages: Computer Science, Computer Science (R0)
