Abstract
This paper proposes a Gaussian Mixture Model (GMM)-based speech emotion recognition method using four feature parameters: 1) Fast Fourier Transform (FFT) spectral entropy, 2) delta FFT spectral entropy, 3) Mel-frequency Filter Bank (MFB) spectral entropy, and 4) delta MFB spectral entropy. We use a speech database covering four emotions: anger, sadness, happiness, and neutrality. We perform speech emotion recognition experiments for each pre-defined emotion and gender. The experimental results show that the proposed emotion recognition using FFT spectral entropy and MFB spectral entropy performs better than existing GMM-based emotion recognition using energy, Zero Crossing Rate (ZCR), Linear Prediction Coefficient (LPC), and pitch parameters.
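As an illustration of the first two feature parameters, the following is a minimal sketch of frame-level FFT spectral entropy and its delta. It assumes the common definition (Shannon entropy of the normalized power spectrum); the abstract does not state the paper's exact formula, and the frame length, hop size, Hamming window, and the function names `frame_signal`, `spectral_entropy`, and `delta` are hypothetical choices, not taken from the paper.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames.

    The defaults (25 ms frames, 10 ms hop at 16 kHz) are assumptions.
    """
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def spectral_entropy(frames, eps=1e-12):
    """Shannon entropy (bits) of each frame's normalized FFT power spectrum."""
    windowed = frames * np.hamming(frames.shape[1])
    power = np.abs(np.fft.rfft(windowed, axis=1)) ** 2
    # Normalize each spectrum so it sums to 1 and can be read as a distribution.
    prob = power / (power.sum(axis=1, keepdims=True) + eps)
    return -np.sum(prob * np.log2(prob + eps), axis=1)

def delta(feature):
    """First-order frame-to-frame difference, same length as the input."""
    return np.diff(feature, prepend=feature[0])

# Usage: a pure tone concentrates energy in few bins (low entropy),
# while white noise spreads it across the spectrum (high entropy).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
h_tone = spectral_entropy(frame_signal(np.sin(2 * np.pi * 440 * t)))
h_noise = spectral_entropy(frame_signal(rng.standard_normal(16000)))
d_noise = delta(h_noise)
```

An MFB variant would, under the same assumption, apply the entropy to mel filter bank energies instead of raw FFT bins. The resulting per-frame features could then be modeled with one GMM per emotion class, as the abstract describes.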
Keywords
- Gaussian Mixture Model
- Emotion Recognition
- Spectral Entropy
- Zero Crossing Rate
- Speech Emotion Recognition
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, WS., Roh, YW., Kim, DJ., Kim, JH., Hong, KS. (2008). Speech Emotion Recognition Using Spectral Entropy. In: Xiong, C., Liu, H., Huang, Y., Xiong, Y. (eds) Intelligent Robotics and Applications. ICIRA 2008. Lecture Notes in Computer Science, vol 5315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88518-4_6
DOI: https://doi.org/10.1007/978-3-540-88518-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88516-0
Online ISBN: 978-3-540-88518-4
eBook Packages: Computer Science, Computer Science (R0)