Abstract
Sound and music usually occur in unstructured environments where their frequency content varies over time. These temporal variations are one of the major problems in music information retrieval. Additionally, polyphonic music, or polyphony, is the simultaneous combination of two or more tones or melodic lines, where each line is an independent melody played by an instrument. As a result, identifying the individual instruments in recordings of polyphonic music with conventional methods is difficult and inaccurate. In this paper, a framework is presented for predominant instrument recognition in real-world polyphonic music. The framework combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNNs are used to extract important features that are invariant to local spectral and temporal variations. RNNs are used because they quickly learn the long-term dependencies in audio signals. The resulting convolutional recurrent neural networks (CRNNs) showed improved performance compared with a network built using only convolutional neural networks.
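The division of labour described in the abstract (convolution for local spectro-temporal features, recurrence for dependencies across time) can be illustrated with a toy forward pass. The sketch below is not the authors' network. The layer sizes, the single 3x3 kernel, the frequency-only pooling, the plain tanh recurrence, the per-instrument sigmoid outputs, and the random weights and input are all assumptions chosen to keep the example small and dependency-free.

```python
import math
import random


def conv2d_valid(x, kernel):
    """2-D cross-correlation ('valid' padding) followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(x[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def max_pool_freq(x, size):
    """Max-pool along the frequency axis only, keeping every time frame."""
    pooled = []
    for start in range(0, len(x) - size + 1, size):
        band = x[start:start + size]
        pooled.append([max(col) for col in zip(*band)])
    return pooled


def rnn_last_state(frames, w_in, w_rec):
    """Plain tanh recurrence over time; returns the final hidden state."""
    hidden = [0.0] * len(w_rec)
    for frame in frames:
        hidden = [
            math.tanh(sum(w * v for w, v in zip(w_in[k], frame))
                      + sum(w * h for w, h in zip(w_rec[k], hidden)))
            for k in range(len(w_rec))
        ]
    return hidden


def crnn_forward(spec, params):
    """Convolution + pooling, then recurrence over time, then sigmoid scores."""
    fmap = max_pool_freq(conv2d_valid(spec, params["kernel"]), params["pool"])
    frames = [list(col) for col in zip(*fmap)]  # one feature vector per time frame
    hidden = rnn_last_state(frames, params["w_in"], params["w_rec"])
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in params["w_out"]]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes; a real system would use a mel-spectrogram and trained weights.
rng = random.Random(0)
N_FREQ, N_TIME, N_HIDDEN, N_INSTRUMENTS, POOL = 10, 12, 4, 3, 2
spec = [[rng.random() for _ in range(N_TIME)] for _ in range(N_FREQ)]
n_features = (N_FREQ - 3 + 1) // POOL  # frequency bins left after conv + pooling
params = {
    "kernel": random_matrix(3, 3, rng, scale=0.5),
    "pool": POOL,
    "w_in": random_matrix(N_HIDDEN, n_features, rng),
    "w_rec": random_matrix(N_HIDDEN, N_HIDDEN, rng),
    "w_out": random_matrix(N_INSTRUMENTS, N_HIDDEN, rng),
}
scores = crnn_forward(spec, params)  # one score in (0, 1) per instrument
```

Pooling only along frequency keeps the full time resolution, so the recurrent stage still sees one feature vector per frame. A practical model would stack several convolutional layers and would likely use a gated recurrent unit (GRU or LSTM) in place of the plain tanh recurrence shown here.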
Acknowledgements
It is our privilege to acknowledge, with a deep sense of gratitude, the keen personal interest and invaluable guidance of our Head of the Department, Dr. Suresh Pabboju, Professor, Department of Information Technology, Chaitanya Bharathi Institute of Technology.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kilambi, B.R., Parankusham, A.R., Tadepalli, S.K. (2021). Instrument Recognition in Polyphonic Music Using Convolutional Recurrent Neural Networks. In: Pandian, A.P., Palanisamy, R., Ntalianis, K. (eds) Proceedings of International Conference on Intelligent Computing, Information and Control Systems. Advances in Intelligent Systems and Computing, vol 1272. Springer, Singapore. https://doi.org/10.1007/978-981-15-8443-5_38
DOI: https://doi.org/10.1007/978-981-15-8443-5_38
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8442-8
Online ISBN: 978-981-15-8443-5
eBook Packages: Intelligent Technologies and Robotics (R0)