Abstract
In this work we analyze and implement several audio features. We focus our analysis on the zero-crossing rate (ZCR) feature and propose a modification that makes it more robust when signals are near zero. The features are used to discriminate among the following audio classes: music, speech, and environmental sound. We use a support vector machine (SVM) classifier, which has proven efficient for audio classification. Using a selection heuristic, we draw conclusions about how the features may be combined for fast classification.
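To illustrate why ZCR can be unreliable near zero, the sketch below compares a plain ZCR with a dead-band variant that ignores samples whose magnitude falls below a small threshold. The abstract does not describe the modification the authors propose, so the dead-band approach, the function names, and the `eps` parameter are all assumptions made for illustration, not the paper's method.

```python
import math


def zcr(frame):
    """Plain zero-crossing rate: fraction of adjacent sample pairs with different signs.

    Exact zeros are treated as positive.
    """
    if len(frame) < 2:
        return 0.0
    signs = [1 if x >= 0 else -1 for x in frame]
    crossings = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return crossings / (len(frame) - 1)


def zcr_deadband(frame, eps=0.01):
    """Illustrative ZCR variant (an assumption, not the paper's definition).

    Samples with |x| <= eps are skipped, so a crossing is counted only when
    the signal moves from above +eps to below -eps or the reverse.
    """
    if len(frame) < 2:
        return 0.0
    signs = [1 if x > 0 else -1 for x in frame if abs(x) > eps]
    crossings = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return crossings / (len(frame) - 1)


# Near-silent frame: tiny alternating values, e.g. quantization noise.
noise = [0.001 * (-1) ** n for n in range(100)]

# Clean tone: 4 cycles over 400 samples, offset by half a sample
# so that no sample lands exactly on zero.
tone = [math.sin(2 * math.pi * 4 * (n + 0.5) / 400) for n in range(400)]

print(zcr(noise), zcr_deadband(noise))
print(zcr(tone), zcr_deadband(tone))
```

On the near-silent frame the plain ZCR saturates while the dead-band variant reports no crossings; on the tone the two agree. The threshold would need tuning to the signal's amplitude scale.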
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Bengolea, G., Acevedo, D., Rais, M., Mejail, M. (2014). Feature Analysis for Audio Classification. In: Bayro-Corrochano, E., Hancock, E. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2014. Lecture Notes in Computer Science, vol 8827. Springer, Cham. https://doi.org/10.1007/978-3-319-12568-8_30
DOI: https://doi.org/10.1007/978-3-319-12568-8_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12567-1
Online ISBN: 978-3-319-12568-8
eBook Packages: Computer Science (R0)