Abstract
Object detection has seen a surge of interest in recent years, which has led to increasingly effective techniques. These techniques, however, still mostly perform detection based on local evidence in the input image. While some progress has been made towards exploiting scene context, the resulting methods typically consider only a single image at a time. Intuitively, however, the information contained jointly in multiple images should help overcome phenomena such as occlusion and poor resolution. In this paper, we address the co-detection problem, which aims to leverage this collective power to achieve object detection simultaneously in all the images of a set. To this end, we formulate object co-detection as inference in a fully-connected CRF whose edges model the similarity between object candidates. We then learn a similarity function that allows us to efficiently perform inference in this fully-connected graph, even in the presence of many object candidates. This contrasts with existing co-detection techniques, which rely on exhaustive or greedy search and thus do not scale well. Our experiments demonstrate the benefits of our approach on several co-detection datasets.
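To make the formulation concrete, the following is a minimal sketch of inference in a binary fully-connected CRF over object candidates, under stated assumptions. It uses naive mean-field updates with a fixed Gaussian similarity kernel and costs O(N²) per iteration; it does not reproduce the learned similarity function or the efficient inference scheme described in the abstract. The function name, its parameters, and the toy inputs are all hypothetical.

```python
import numpy as np

def mean_field_codetection(unary, features, weight=1.0, bandwidth=1.0, n_iters=10):
    """Naive mean-field inference for a binary fully-connected CRF (illustrative only).

    unary:    (N,) detector scores treated as log-odds of "object" (assumed input).
    features: (N, D) appearance descriptors of the N candidates (assumed input).
    Returns an (N,) array of approximate marginals q_i = P(candidate i is an object).
    """
    unary = np.asarray(unary, dtype=float)
    features = np.asarray(features, dtype=float)

    # Gaussian similarity between every pair of candidates (assumed kernel form).
    sq_dists = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    np.fill_diagonal(kernel, 0.0)  # a candidate sends no message to itself

    # Start from the beliefs implied by the unary scores alone.
    q = 1.0 / (1.0 + np.exp(-unary))
    for _ in range(n_iters):
        # Potts-style coupling: similar candidates are pushed towards the same label.
        message = kernel @ (2.0 * q - 1.0)
        q = 1.0 / (1.0 + np.exp(-(unary + weight * message)))
    return q

# Toy usage: candidate 1 is weak but resembles the confident candidate 0,
# while candidate 2 is equally weak and resembles nothing else.
unary = [2.0, -0.5, -0.5]
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
q = mean_field_codetection(unary, features)
print(q)
```

In this toy setting the weak candidate that resembles a confident detection ends up with a higher belief than the equally weak but isolated candidate, which is the intuition behind sharing evidence across images.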
NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy, as well as by the Australian Research Council through the ICT Centre of Excellence program.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Hayder, Z., Salzmann, M., He, X. (2014). Object Co-detection via Efficient Inference in a Fully-Connected CRF. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8691. Springer, Cham. https://doi.org/10.1007/978-3-319-10578-9_22
DOI: https://doi.org/10.1007/978-3-319-10578-9_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10577-2
Online ISBN: 978-3-319-10578-9
eBook Packages: Computer Science, Computer Science (R0)