Abstract
We propose a method for predicting human egocentric visual attention using bottom-up visual saliency and egomotion information. Computational models of visual saliency are often employed to predict human attention; however, their mechanisms and effectiveness have not been fully explored in egocentric vision. The purpose of our framework is to compute attention maps from an egocentric video that can be used to infer a person’s visual attention. In addition to a standard visual saliency model, two kinds of attention maps are computed based on the camera’s rotation velocity and direction of movement. These rotation-based and translation-based attention maps are aggregated with a bottom-up saliency map to improve the accuracy with which the person’s gaze positions can be predicted. The effectiveness of the proposed framework was examined in real environments using a head-mounted gaze tracker, and we found that the egomotion-based attention maps contributed to accurately predicting human visual attention.
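As a rough illustration of the aggregation step described in the abstract, the sketch below combines a bottom-up saliency map with rotation-based and translation-based attention maps modeled as Gaussians centered on image points implied by the camera motion. The function names, the Gaussian form of the egomotion maps, the weights, and the weighted-product combination are assumptions made for illustration only; the paper's actual map construction and aggregation rule are not reproduced here.

```python
import numpy as np

def gaussian_map(shape, center, sigma):
    """2-D Gaussian attention map with peak 1 at `center` = (x, y)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = center
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def combine_attention_maps(saliency, rotation_center, translation_center,
                           sigma=40.0, weights=(1.0, 1.0, 1.0)):
    """Aggregate a bottom-up saliency map with egomotion-based maps.

    `rotation_center`: image point the camera is rotating toward (derived
    from the estimated rotation velocity); `translation_center`: focus of
    expansion implied by the camera's direction of movement. Both are
    hypothetical inputs here; the maps are combined by a weighted product,
    which is an assumed aggregation scheme, not the paper's.
    """
    rot_map = gaussian_map(saliency.shape, rotation_center, sigma)
    trans_map = gaussian_map(saliency.shape, translation_center, sigma)
    w_s, w_r, w_t = weights
    combined = (saliency ** w_s) * (rot_map ** w_r) * (trans_map ** w_t)
    return combined / (combined.max() + 1e-12)

# Usage: predict the most likely gaze position in one frame.
saliency = np.random.rand(480, 640)  # placeholder bottom-up saliency map
attention = combine_attention_maps(saliency,
                                   rotation_center=(400, 200),
                                   translation_center=(320, 240))
gaze_y, gaze_x = np.unravel_index(np.argmax(attention), attention.shape)
```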
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Yamada, K., Sugano, Y., Okabe, T., Sato, Y., Sugimoto, A., Hiraki, K. (2011). Attention Prediction in Egocentric Video Using Motion and Visual Saliency. In: Ho, YS. (eds) Advances in Image and Video Technology. PSIVT 2011. Lecture Notes in Computer Science, vol 7087. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25367-6_25
DOI: https://doi.org/10.1007/978-3-642-25367-6_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25366-9
Online ISBN: 978-3-642-25367-6