Abstract
The problem of ranking a set of visual samples according to their relevance to a query plays an important role in computer vision. The traditional approach for ranking is to train a binary classifier such as a support vector machine (svm). Binary classifiers suffer from two main deficiencies: (i) they do not optimize a ranking-based loss function, for example, the average precision (ap) loss; and (ii) they cannot incorporate high-order information such as the a priori correlation between the relevance of two visual samples (for example, two persons in the same image tend to perform the same action). We propose two novel learning formulations that allow us to incorporate high-order information for ranking. The first framework, called high-order binary svm (hob-svm), allows for a structured input. The parameters of hob-svm are learned by minimizing a convex upper bound on a surrogate 0-1 loss function. In order to obtain the ranking of the samples that form the structured input, hob-svm sorts the samples according to their max-marginals. The second framework, called high-order average precision svm (hoap-svm), also allows for a structured input and uses the same ranking criterion. However, in contrast to hob-svm, the parameters of hoap-svm are learned by minimizing a difference-of-convex upper bound on the ap loss. Using a standard, publicly available dataset for the challenging problem of action classification, we show that both hob-svm and hoap-svm outperform the baselines that ignore high-order information.
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Behl, A., Jawahar, C.V., Kumar, M.P.: Optimizing average precision using weakly supervised data. In: CVPR (2014)
Bourdev, L., Malik, J.: Poselets: Body part detectors trained using 3D human pose annotations. IJCV (2009)
Delaitre, V., Laptev, I., Sivic, J.: Recognizing human actions in still images: a study of bag-of-features and part-based representations. In: BMVC (2010)
Everingham, M., Gool, L., Williams, C., Winn, J., Zisserman, A.: The pascal visual object classes (voc) challenge. IJCV (2010)
Finley, T., Joachims, T.: Training structural SVMs when exact inference is intractable. In: ICML (2008)
Franc, V., Savchynskyy, B.: Discriminative learning of max-sum classifiers. JMLR (2008)
Horst, R., Thoai, N.: DC programming overview. Journal of Optimization Theory and Applications (1999)
Joachims, T., Finley, T., Yu, C.: Cutting-plane training of structural SVMs. Machine Learning (2009)
Kohli, P., Torr, P.: Measuring uncertainty in graph cut solutions efficiently computing min-marginal energies using dynamic graph cuts. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. Part II. LNCS, vol. 3952, pp. 30–43. Springer, Heidelberg (2006)
Kohli, P., Torr, P.: Dynamic graph cuts for efficient inference in Markov random fields. PAMI (2007)
Kolmogorov, V.: Convergent tree-reweighted message passing for energy minimization. PAMI (2006)
Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts. PAMI (2004)
Komodakis, N.: Efficient training for pairwise or higher order crfs via dual decomposition. In: CVPR (2011)
Komodakis, N., Paragios, N., Tziritas, G.: MRF energy minimization and beyond via dual decomposition. PAMI (2011)
Kulesza, A., Pereira, F.: Structured learning with approximate inference. In: NIPS (2007)
Maji, S., Bourdev, L., Malik, J.: Action recognition from a distributed representation of pose and appearance. In: CVPR (2011)
Oliva, A., Torralba, A.: Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV (2001)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann (1988)
Rosenfeld, N., Meshi, O., Globerson, A., Tarlow, D.: Learning structured models with the AUC loss and its generalizations. In: AISTAT (2014)
Taskar, B., Guestrin, C., Koller, D.: Max-margin Markov networks. In: NIPS (2003)
Taskar, B., Lacoste-Julien, S., Jordan, M.I.: Structured prediction, dual extragradient and bregman projections. JMLR (2006)
Tsochantaridis, I., Hofmann, T., Joachims, T., Altun, Y.: Support vector machine learning for interdependent and structured output spaces. In: ICML (2004)
Vapnik, V.: Statistical learning theory. Wiley (1998)
Yue, Y., Finley, T., Radlinski, F., Joachims, T.: A support vector method for optimizing average precision. In: SIGIR (2007)
Yuille, A., Rangarajan, A.: The concave-convex procedure. Neural Computation (2003)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Dokania, P.K., Behl, A., Jawahar, C.V., Kumar, M.P. (2014). Learning to Rank Using High-Order Information. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8692. Springer, Cham. https://doi.org/10.1007/978-3-319-10593-2_40
Download citation
DOI: https://doi.org/10.1007/978-3-319-10593-2_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10592-5
Online ISBN: 978-3-319-10593-2
eBook Packages: Computer ScienceComputer Science (R0)