Abstract
In this paper, we exploit the complementarity between low-rank features and part-based features and present a novel method that extracts a flexible number of discriminative parts from low-rank features for action recognition. The proposed method avoids intermediate processing steps (e.g., actor segmentation, body tracking) required by many traditional methods and largely avoids memorizing background information, a problem from which traditional part-based methods suffer. In addition, traditional part-based methods usually set a fixed, identical number of discriminative parts for all action categories, neglecting differences in recognition complexity among categories. In contrast, we automatically extract a flexible number of discriminative parts for each action category by introducing a group sparse regularizer into our model, which is more reasonable and effective. In our method, we first extract low-rank features of all action sequences and transform them into corresponding low-rank images. Then, we densely sample each low-rank image into a large number of multi-scale parts and represent each part as a feature vector. Afterward, our model automatically learns a set of discriminative part detectors, with a flexible number per action category. We further define new similarity constraints that force the responses of detected parts from the same class to be more similar and consistent, and those from different classes to be more distinct. Finally, we define a corresponding recognition criterion to perform the final action recognition. The efficacy of the proposed method is verified on three public datasets, and the experimental results demonstrate the promise of our method for human action recognition.
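Two steps of the pipeline above lend themselves to a compact illustration: dense multi-scale sampling of parts from a low-rank image, and the group sparse regularizer that lets the number of retained part detectors vary per class. The following is a minimal sketch under simplifying assumptions (grayscale NumPy images, flattened pixel vectors as part descriptors, an l2,1 proximal step standing in for the full learning objective); the function names and parameters are illustrative and not taken from the paper.

```python
import numpy as np

def dense_multiscale_parts(image, scales=(16, 32), stride=8):
    """Densely sample square parts from a (low-rank) image at multiple scales.

    Each part is flattened into a raw pixel vector here for simplicity;
    the paper represents parts with richer feature descriptors."""
    parts = []
    H, W = image.shape
    for s in scales:
        for y in range(0, H - s + 1, stride):
            for x in range(0, W - s + 1, stride):
                parts.append(image[y:y + s, x:x + s].reshape(-1))
    return parts

def group_sparse_prox(W, lam):
    """Proximal step for an l2,1 (group sparse) penalty on detector matrix W.

    Each row of W is one candidate part detector. Rows whose l2 norm falls
    below lam are zeroed out entirely, so the number of detectors that
    survive training can differ from one action class to another."""
    out = np.zeros_like(W)
    norms = np.linalg.norm(W, axis=1)
    keep = norms > lam
    # Shrink surviving rows toward zero by a factor (1 - lam / ||w||).
    out[keep] = W[keep] * (1.0 - lam / norms[keep, None])
    return out
```

For a 64x64 image with the defaults above, sampling yields 49 parts at scale 16 and 25 at scale 32; a larger lam in the proximal step prunes more detector rows, which is the mechanism behind the flexible per-class part count.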
Cite this article
Huang, S., Ye, J., Wang, T. et al. Extracting Discriminative Parts with Flexible Number from Low-Rank Features for Human Action Recognition. Arab J Sci Eng 41, 2987–3001 (2016). https://doi.org/10.1007/s13369-016-2042-5