Abstract
The amount of audio available in databases on the Internet today is immense. Even systems that support searches for multimedia content, such as AltaVista and Lycos, only allow queries based on the multimedia filename, nearby text on the web page containing the file, and metadata embedded in the file, such as title and author. Such queries may yield useful results if the metadata provided by the distributor is extensive. Producing this metadata is a tedious manual task, so automatic means of creating it are needed. This paper proposes an algorithm that segments the given audio and extracts features such as MFCC, SF, SNR, and ZCR, and presents experimental results for the algorithm.
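To make one of the listed features concrete, the following is a minimal sketch of a frame-wise zero-crossing rate, assuming ZCR here denotes the zero-crossing rate. The function names, the frame length of 400 samples, the hop of 200 samples, and the synthetic test signal are illustrative assumptions and are not taken from the paper.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def framewise_zcr(signal, frame_len=400, hop=200):
    """ZCR of each full frame; frame_len and hop are assumed values."""
    return [
        zero_crossing_rate(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


# Synthetic signal: half a second of a 100 Hz tone, then half a second
# of a 1000 Hz tone, both sampled at 8 kHz (hypothetical test data).
sr = 8000
low = [math.sin(2 * math.pi * 100 * n / sr) for n in range(sr // 2)]
high = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(sr // 2)]
zcr = framewise_zcr(low + high)
```

In this synthetic case the frames covering the higher-frequency tone have a noticeably larger ZCR than those covering the lower-frequency tone, which is the kind of change a segmentation step could use as a boundary cue.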
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Bhandari, G.M., Kawitkar, R.S., Borawake, M.P. (2014). Audio Segmentation for Speech Recognition Using Segment Features. In: Satapathy, S., Avadhani, P., Udgata, S., Lakshminarayana, S. (eds) ICT and Critical Infrastructure: Proceedings of the 48th Annual Convention of Computer Society of India- Vol II. Advances in Intelligent Systems and Computing, vol 249. Springer, Cham. https://doi.org/10.1007/978-3-319-03095-1_23
DOI: https://doi.org/10.1007/978-3-319-03095-1_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03094-4
Online ISBN: 978-3-319-03095-1
eBook Packages: Engineering, Engineering (R0)