Abstract
Because voice signals and facial expressions change in synchrony under different emotions, a recognition algorithm based on audio-visual feature fusion is proposed to identify emotional states more accurately. Prosodic features are extracted as the speech emotion features, and local Gabor binary patterns are adopted as the facial expression features. Each feature type is modeled with its own SVM to obtain the probabilities of anger, disgust, fear, happiness, sadness, and surprise, and the two sets of probabilities are then fused to reach the final decision. Simulation results show that the single-modal classifiers based on speech signals and on facial expression reach average recognition rates of 60% and 57% respectively, while the multimodal classifier that fuses speech signals and facial expression achieves 72%.
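The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which combination rule is used, so the product rule and weighted sum rule below are assumptions chosen because they are common ways to combine classifier posteriors. The function name `fuse`, the weight `w_speech`, and the example probability vectors are hypothetical and are not taken from the paper.

```python
# Illustrative decision-level fusion of two per-class probability vectors.
# Assumption: the fusion rule ("product" or weighted "sum") is NOT specified
# in the abstract; both are shown only as plausible examples.

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def fuse(p_speech, p_face, rule="product", w_speech=0.5):
    """Combine speech and face class probabilities and pick the top emotion.

    p_speech, p_face: sequences of six probabilities ordered as EMOTIONS,
    e.g. the probability outputs of two separately trained SVMs.
    Returns (predicted_label, fused_probabilities).
    """
    if len(p_speech) != len(EMOTIONS) or len(p_face) != len(EMOTIONS):
        raise ValueError("expected one probability per emotion class")

    if rule == "product":
        # Product rule: multiply the per-class probabilities.
        scores = [a * b for a, b in zip(p_speech, p_face)]
    elif rule == "sum":
        # Weighted sum rule: w_speech is a hypothetical modality weight.
        scores = [w_speech * a + (1.0 - w_speech) * b
                  for a, b in zip(p_speech, p_face)]
    else:
        raise ValueError("rule must be 'product' or 'sum'")

    total = sum(scores)
    if total == 0:
        raise ValueError("fused scores sum to zero")

    # Renormalize so the fused scores form a probability distribution.
    fused = [s / total for s in scores]
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best], fused


# Made-up example: the speech classifier slightly prefers anger,
# while the face classifier prefers fear.
p_speech = [0.40, 0.05, 0.35, 0.10, 0.05, 0.05]
p_face = [0.20, 0.10, 0.45, 0.10, 0.10, 0.05]

label, fused = fuse(p_speech, p_face, rule="product")
print(label, [round(p, 3) for p in fused])
```

In this made-up example the two modalities disagree on the top class, and the fused decision follows the class that both classifiers rate highly, which is the intuition behind combining the two probability outputs.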
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tu, B., Yu, F. (2011). Bimodal Emotion Recognition Based on Speech Signals and Facial Expression. In: Wang, Y., Li, T. (eds) Foundations of Intelligent Systems. Advances in Intelligent and Soft Computing, vol 122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25664-6_81
DOI: https://doi.org/10.1007/978-3-642-25664-6_81
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25663-9
Online ISBN: 978-3-642-25664-6
eBook Packages: Engineering, Engineering (R0)