Instrument Recognition in Polyphonic Music Using Convolutional Recurrent Neural Networks

  • Conference paper
Proceedings of International Conference on Intelligent Computing, Information and Control Systems

Abstract

Sounds and music usually occur in unstructured environments where their frequency content varies over time. These temporal variations are one of the major problems in music information retrieval. Additionally, polyphonic music, or polyphony, is the simultaneous combination of two or more tones or melodic lines, where each line is an independent melody played by an instrument. As a result, identifying the various instruments in recordings of polyphonic music is difficult and inaccurate using conventional methods. In this paper, a framework is presented for predominant instrument recognition in real-world polyphonic music. The framework combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNNs are used to extract important features that are invariant to local spectral and temporal variations. RNNs are used because they quickly learn the long-term dependencies in audio signals. The convolutional recurrent neural networks (CRNNs) showed improved performance compared with a network built using only convolutional neural networks.
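The division of labour described in the abstract can be illustrated with a toy forward pass. This is only a minimal sketch under stated assumptions, not the authors' architecture: the abstract gives no layer sizes, so the function names (`conv_relu_pool`, `rnn_step`, `crnn_forward`), the one-dimensional convolution along frequency, the Elman-style recurrence, the sigmoid output per instrument, and all dimensions and random weights are hypothetical choices made for illustration.

```python
import math
import random


def conv_relu_pool(frame, kernel, pool=2):
    """Convolve one spectrogram frame along frequency, apply ReLU, then max-pool."""
    k = len(kernel)
    conv = [
        max(0.0, sum(kernel[j] * frame[i + j] for j in range(k)))
        for i in range(len(frame) - k + 1)
    ]
    # Max-pooling keeps the strongest response in each window, which gives
    # tolerance to small shifts along the frequency axis.
    return [max(conv[i:i + pool]) for i in range(0, len(conv) - pool + 1, pool)]


def rnn_step(x, h, w_in, w_rec):
    """One Elman-style update: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_in[u], x))
            + sum(w * hi for w, hi in zip(w_rec[u], h))
        )
        for u in range(len(h))
    ]


def crnn_forward(spectrogram, kernel, w_in, w_rec, w_out):
    """Return one sigmoid score per instrument for a (time x frequency) input."""
    h = [0.0] * len(w_rec)
    for frame in spectrogram:
        features = conv_relu_pool(frame, kernel)   # convolutional stage
        h = rnn_step(features, h, w_in, w_rec)     # recurrent stage over time
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in w_out]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def random_matrix(rows, cols, rng):
    """Hypothetical untrained weights, drawn uniformly for illustration only."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_frames, n_bins, n_hidden, n_instruments = 6, 10, 4, 3   # assumed toy sizes
kernel = [0.25, 0.5, 0.25]
n_features = (n_bins - len(kernel) + 1) // 2   # 8 conv outputs, 4 after pooling

spectrogram = [[rng.random() for _ in range(n_bins)] for _ in range(n_frames)]
w_in = random_matrix(n_hidden, n_features, rng)
w_rec = random_matrix(n_hidden, n_hidden, rng)
w_out = random_matrix(n_instruments, n_hidden, rng)

scores = crnn_forward(spectrogram, kernel, w_in, w_rec, w_out)
print(scores)
```

The weights here are untrained, so the printed scores carry no meaning; the point is only the data flow, in which pooled convolutional features from each frame feed a hidden state that is carried across time before the instrument scores are read out.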

Acknowledgements

It is our privilege to acknowledge, with a deep sense of gratitude and devotion, the keen personal interest and invaluable guidance rendered by our Head of the Department, Dr. Suresh Pabboju, Professor, Department of Information Technology, Chaitanya Bharathi Institute of Technology.

Author information

Corresponding author

Correspondence to Bhargav Ram Kilambi.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kilambi, B.R., Parankusham, A.R., Tadepalli, S.K. (2021). Instrument Recognition in Polyphonic Music Using Convolutional Recurrent Neural Networks. In: Pandian, A.P., Palanisamy, R., Ntalianis, K. (eds) Proceedings of International Conference on Intelligent Computing, Information and Control Systems. Advances in Intelligent Systems and Computing, vol 1272. Springer, Singapore. https://doi.org/10.1007/978-981-15-8443-5_38
