Abstract
The volume of audio data on public networks such as the Internet is increasing tremendously every day, which makes that data harder to access. Hence, there is a need for efficient indexing and annotation mechanisms. The non-stationarity and discontinuity present in audio signals increase the difficulty of segmenting and classifying them. Another challenging task is extracting and selecting the optimal features of an audio signal. Application areas of audio classification and retrieval systems include speaker recognition, gender classification, music genre classification and environmental sound classification. This paper proposes a machine learning- and neural network-based approach that performs pre-processing, segmentation, feature extraction, classification and retrieval of audio signals from a dataset. We propose a novel approach to classification and retrieval using FPNN, which combines the characteristics of fuzzy logic and PNN. We found that the FPNN classifier gives better accuracy, F1-score and Kappa coefficient values than the SVM, k-NN and PNN classifiers.
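The abstract compares classifiers on accuracy, F1-score and the Kappa coefficient without giving formulas. As a minimal sketch, the three measures can be computed from true and predicted labels using their standard definitions. The function name `classification_metrics`, the macro averaging of F1 across classes and the example labels are assumptions made for illustration; they are not taken from the chapter.

```python
from collections import Counter
import math


def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro F1, Cohen's kappa) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    true_counts = Counter(y_true)   # per class: TP + FN
    pred_counts = Counter(y_pred)   # per class: TP + FP
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # per class: TP

    # Accuracy: fraction of samples whose predicted label matches the true label.
    accuracy = sum(correct.values()) / n

    # Per-class F1 = 2*TP / (2*TP + FP + FN), averaged with equal class weight
    # (macro averaging is an assumption; the abstract does not state the scheme).
    f1_scores = []
    for label in labels:
        denom = true_counts[label] + pred_counts[label]
        f1_scores.append(2 * correct[label] / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / len(labels)

    # Cohen's kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    # agreement (accuracy) and p_e is the agreement expected by chance.
    expected = sum(true_counts[l] * pred_counts[l] for l in labels) / (n * n)
    if math.isclose(expected, 1.0):
        kappa = float("nan")  # undefined when chance agreement is total
    else:
        kappa = (accuracy - expected) / (1 - expected)

    return accuracy, macro_f1, kappa


# Hypothetical labels for six audio clips, for illustration only.
y_true = ["music", "music", "speech", "speech", "noise", "noise"]
y_pred = ["music", "speech", "speech", "speech", "noise", "music"]
accuracy, macro_f1, kappa = classification_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f} kappa={kappa:.3f}")
```

Kappa corrects accuracy for the agreement that would occur by chance, so reporting it alongside accuracy helps when the classes in a dataset are unevenly represented.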
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Patil, N.M., Nemade, M.U. (2019). Content-Based Audio Classification and Retrieval Using Segmentation, Feature Extraction and Neural Network Approach. In: Bhatia, S., Tiwari, S., Mishra, K., Trivedi, M. (eds) Advances in Computer Communication and Computational Sciences. Advances in Intelligent Systems and Computing, vol 924. Springer, Singapore. https://doi.org/10.1007/978-981-13-6861-5_23
DOI: https://doi.org/10.1007/978-981-13-6861-5_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6860-8
Online ISBN: 978-981-13-6861-5
eBook Packages: Intelligent Technologies and Robotics (R0)