Abstract
Ranking models have recently been proposed for cascaded object detection, and have been shown to improve over regression or binary classification in this setting [1,2]. Rather than train a classifier in a binary setting and interpret the function post hoc as a ranking objective, these approaches directly optimize regularized risk objectives that seek to score highest the windows that most closely match the ground truth. In this work, we evaluate the effect of non-maximal suppression (NMS) on the cascade architecture, showing that this step is essential for high performance. Furthermore, we demonstrate that NMS has a significant effect on the tradeoff between recall at different points on the overlap-recall curve. We further develop additional objectness features at low computational cost that improve performance on the category-independent object detection task introduced by Alexe et al. [3]. We show empirically on the PASCAL VOC dataset that a simple and efficient NMS strategy yields better results in a typical cascaded detection architecture than the previous state of the art [4,1]. This demonstrates that NMS, an often ignored stage in the detection pipeline, can be a dominating factor in the performance of detection systems.
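The abstract does not specify which NMS strategy the paper uses, so the following is only a minimal sketch of the common greedy variant, given for orientation. It assumes axis-aligned boxes given as (x1, y1, x2, y2) and intersection-over-union as the overlap measure. The names `iou`, `greedy_nms`, and `overlap_threshold` are illustrative and do not come from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    overlap_threshold: float = 0.5,  # hypothetical default, not from the paper
) -> List[int]:
    """Return indices of the windows kept, in order of decreasing score.

    A window is kept only if its overlap with every previously kept
    (higher-scoring) window is at most `overlap_threshold`.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= overlap_threshold for j in kept):
            kept.append(i)
    return kept
```

In a sketch like this, a lower `overlap_threshold` suppresses more near-duplicate windows and so spreads the retained proposals over more of the image, while a higher value keeps tightly clustered windows. This is one way the threshold can shift recall between low and high overlap levels.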
References
1. Rahtu, E., Kannala, J., Blaschko, M.B.: Learning a category independent object detection cascade. In: Proc. ICCV (2011)
2. Zhang, Z., Warrell, J., Torr, P.H.S.: Proposal generation for object detection using cascaded ranking SVMs. In: Proc. CVPR (2011)
3. Alexe, B., Deselaers, T., Ferrari, V.: What is an object? In: Proc. CVPR (2010)
4. Alexe, B., Deselaers, T., Ferrari, V.: Measuring the objectness of image windows. IEEE PAMI (2012)
5. Viola, P., Jones, M.: Robust real-time object detection. IJCV 1 (2001)
6. Romdhani, S., Torr, P., Schölkopf, B., Blake, A.: Computationally efficient face detection. In: Proc. ICCV (2001)
7. Vedaldi, A., Gulshan, V., Varma, M., Zisserman, A.: Multiple kernels for object detection. In: Proc. ICCV (2009)
8. Wu, J., Brubaker, S.C., Mullin, M.D., Rehg, J.M.: Fast asymmetric learning for cascade face detection. IEEE PAMI 30, 369–382 (2008)
9. Lampert, C.H., Blaschko, M.B., Hofmann, T.: Efficient subwindow search: A branch and bound framework for object localization. IEEE PAMI (2009)
10. Lehmann, A., Gehler, P., Van Gool, L.: Branch & rank: Non-linear object detection. In: Proc. BMVC (2011)
11. Herbrich, R., Graepel, T., Obermayer, K.: Large margin rank boundaries for ordinal regression. In: Smola, A.J., Bartlett, P.L., Schölkopf, B., Schuurmans, D. (eds.) Advances in Large Margin Classifiers, pp. 115–132. MIT Press (2000)
12. Bakir, G.H., Hofmann, T., Schölkopf, B., Smola, A.J., Taskar, B., Vishwanathan, S.V.N.: Predicting structured data. MIT Press (2007)
13. Blaschko, M.B., Vedaldi, A., Zisserman, A.: Simultaneous object detection and ranking with weak supervision. In: NIPS (2010)
14. Mittal, A., Blaschko, M.B., Zisserman, A., Torr, P.H.S.: Taxonomic multi-class prediction and person layout using efficient structured ranking. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 245–258. Springer, Heidelberg (2012)
15. Barinova, O., Lempitsky, V., Kohli, P.: On the detection of multiple object instances using Hough transforms. In: Proc. CVPR (2010)
16. Blaschko, M.B.: Branch and bound strategies for non-maximal suppression in object detection. In: Boykov, Y., Kahl, F., Lempitsky, V., Schmidt, F.R. (eds.) EMMCVPR 2011. LNCS, vol. 6819, pp. 385–398. Springer, Heidelberg (2011)
17. McAllester, D.: Generalization bounds and consistency for structured labeling. In: Bakır, G.H., Hofmann, T., Schölkopf, B., Smola, A.J., Taskar, B., Vishwanathan, S.V.N. (eds.) Predicting Structured Data, pp. 247–261. MIT Press (2007)
18. Nemhauser, G.L., Wolsey, L.A., Fisher, M.L.: An analysis of approximations for maximizing submodular set functions. Mathematical Programming 14, 265–294 (1978)
19. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes (VOC) challenge. IJCV 88(2), 303–338 (2010)
20. Hyvärinen, A., Hurri, J., Hoyer, P.: Natural Image Statistics. Springer (2009)
21. Deselaers, T., Alexe, B., Ferrari, V.: Localizing objects while learning their appearance. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 452–466. Springer, Heidelberg (2010)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Blaschko, M.B., Kannala, J., Rahtu, E. (2013). Non Maximal Suppression in Cascaded Ranking Models. In: Kämäräinen, JK., Koskela, M. (eds) Image Analysis. SCIA 2013. Lecture Notes in Computer Science, vol 7944. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38886-6_39
DOI: https://doi.org/10.1007/978-3-642-38886-6_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38885-9
Online ISBN: 978-3-642-38886-6
eBook Packages: Computer Science (R0)