1 Introduction

Musical instrument identification is a popular area of research. Most musical data is now available in digital form, which makes automatic musical instrument identification increasingly important. Monophonic, polyphonic, and homophonic are the different textures of music signals [1]. In a monophonic signal, only one musical instrument is playing, so the audio contains a single instrument playing a single melody. Our work deals with the identification of a musical instrument from a monophonic audio file. Similar musical instruments have largely similar features. A group of sound samples is taken and its features are extracted; many feature extraction techniques exist for audio sound signals [2]. In our work we use the MFCC feature extraction method. Algorithms that compare features across audio samples are called classifiers. In this work, two classifiers, K-Nearest Neighbor (KNN) and Support Vector Machine (SVM), are used to identify the musical instrument.

Our objectives are to recognize a musical instrument from an audio signal and to determine which classifier gives better identification. The whole identification task consists of three steps: pre-processing of the input music signal, feature extraction from the sound signal, and recognition of the instrument [3].

2 System Overview

Our proposed musical instrument identification system is divided into two phases, a training phase and a testing phase, as shown in Fig. 1. In the training phase, audio samples of different musical instruments are fed as input to the system. MFCC features are extracted from these samples and stored in an array-matrix format in Excel, called the feature array. These extracted features are used to train the specified classifier for the subsequent classification process [4]. In the testing phase, an unknown audio sound file is given as input to the system, its MFCC features are extracted in the same way, the results are compared with the reference results obtained in the training phase, and the new signal is classified by the same classifier [5]. The purpose of the proposed work is to find the better classifier and the best combination of feature extraction method and classifier. In the first step, the musical instrument sound database is created. Finally, the input sample is compared with the reference samples to determine the type of instrument.

Fig. 1

General block diagram of proposed musical instrument recognition system

Different machine learning techniques have been used for musical instrument identification [6]. The two main types are supervised and unsupervised learning. In supervised learning, a set of reference audio signals, called the training set, is available together with its target outputs, and the system is trained to produce outputs as close as possible to those targets. In unsupervised learning, outputs are derived from the inputs alone; no target outputs are available. Our system uses a supervised learning technique.

Preprocessing is the first step in instrument identification and is used to improve overall system performance [7]. It consists of two sub-steps: reading the sound and removing noise. Noise removal strengthens the signal and improves recognition accuracy [8]. The next parts of the work are feature extraction and classification. Many feature extraction techniques, and many types of features, exist for musical instrument identification; a study must be carried out to select the specific features to use. In this paper, the MFCC feature extraction method is used. Identifying a musical instrument requires extracting meaningful information: an audio signal contains a great deal of information, not all of which is needed for identification, so feature extraction methods distill the meaningful part for further processing [9]. Often the raw audio data supplied to a classifier is too large yet carries little information, which reduces the classifier's efficiency; in that case, meaningful data is extracted from the audio samples and used instead. This data is usually a collection of features, and is hence called a feature vector [10, 11]. These feature sets are then used to identify the musical instrument. In the third step, recognition is performed using classifiers.
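The three-step flow described above (preprocessing, feature extraction, classification) can be sketched as follows. This is a minimal illustration of the data flow only: the function names are our own, and the feature "extraction" here is a simple per-frame statistic standing in for the real MFCCs described in Sect. 3.1.

```python
import numpy as np

def preprocess(signal, alpha=0.97):
    """Pre-emphasis: subtract a scaled copy of the previous sample to reduce noise."""
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def extract_features(signal, n_frames=13):
    """Stand-in for MFCC extraction: one statistic per frame, giving a fixed-length vector."""
    usable = signal[: len(signal) // n_frames * n_frames]
    return usable.reshape(n_frames, -1).std(axis=1)

def classify(feature_vector, reference_features, labels):
    """Nearest-reference classification: label of the closest stored feature vector."""
    distances = np.linalg.norm(reference_features - feature_vector, axis=1)
    return labels[int(np.argmin(distances))]

# Illustrative flow with synthetic signals standing in for recorded instruments
rng = np.random.default_rng(0)
signals = [rng.standard_normal(1300) * (i + 1) for i in range(3)]
labels = ["piano", "sitar", "guitar"]
refs = np.array([extract_features(preprocess(s)) for s in signals])
query = preprocess(signals[1] + 0.01 * rng.standard_normal(1300))
print(classify(extract_features(query), refs, labels))
```

In the actual system the nearest-reference step is replaced by a trained KNN or SVM classifier.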

3 System Design Methodology

Here we work on musical instrument identification using monophonic audio samples, the MFCC feature extraction method, two different classifiers, and a database system. Details of the different techniques used in this work are given below.

3.1 Mel-Frequency Cepstral Coefficients (MFCC)

Many features can be extracted from an audio sample, of which MFCC features give the best results for identification. MFCCs represent the spectral coefficients of a sound; each coefficient has a specific value for each part of the signal. Nowadays MFCC is widely used in instrument and speaker identification systems. Figure 2 presents a block diagram of the detailed process of extracting MFCC features.

Fig. 2

Block diagram for extracting MFCC features

3.1.1 Database

A database is a well-maintained collection of information, organized so that a program can quickly and easily select any required part of the data [12]. In our work, we build a database of audio samples for different musical instruments such as piano, sitar, and guitar. We use an in-house database so that we can record any number of samples for each instrument. Audio samples of the different instruments are recorded in .wav format at the same size. Of the recorded samples, 70% are used for training and 30% for testing.
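The 70/30 split of the recorded database can be sketched as below; the file names and the fixed random seed are illustrative assumptions, not part of the actual database.

```python
import random

def split_dataset(samples, train_fraction=0.7, seed=42):
    """Shuffle the sample list and split it into training and testing subsets."""
    shuffled = samples[:]                      # copy so the original order is untouched
    random.Random(seed).shuffle(shuffled)      # deterministic shuffle for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# e.g. 20 recorded .wav files for one instrument (names are hypothetical)
files = [f"guitar_{i:02d}.wav" for i in range(20)]
train, test = split_dataset(files)
print(len(train), len(test))  # 14 6
```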

3.1.2 Support Vector Machine (SVM)

Among the many classifiers available to classify data, SVM is one that falls under supervised learning techniques. SVM is used in various applications such as face, character, and handwriting recognition. To transform the data it uses a technique called the kernel trick, and it finds the maximum-margin boundary between the possible outputs using those data transformations. Although SVM design is complex, it gives the best results compared with other classifiers. The SVM classification process is shown in Fig. 3.

Fig. 3

Audio classification model using SVM
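The margin-based idea behind SVM can be illustrated with a minimal linear SVM trained by sub-gradient descent on the hinge loss. This is a toy sketch on synthetic two-dimensional "feature vectors", not the kernel-based classifier used in the actual system.

```python
import numpy as np

def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=200):
    """Minimal linear SVM via sub-gradient descent on the regularized hinge loss.
    Labels in y must be -1 or +1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:       # margin violated: hinge-loss gradient step
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                           # only the regularization term contributes
                w -= lr * lam * w
    return w, b

def predict(X, w, b):
    return np.sign(X @ w + b)

# Two linearly separable clusters standing in for the features of two instruments
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((20, 2)) + 3, rng.standard_normal((20, 2)) - 3])
y = np.array([1] * 20 + [-1] * 20)
w, b = train_linear_svm(X, y)
print(np.mean(predict(X, w, b) == y))
```

Real-world audio features are rarely linearly separable, which is why the kernel trick mentioned above is used in practice.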

3.1.3 K-Nearest Neighbor Algorithm

The K-nearest neighbor algorithm finds objects of the same nature and groups them: objects of the same category, having the same features, lie closer together in distance. It is an instance-based learning technique that assigns a class to new test data based on the samples in the database. Among the various machine learning algorithms, KNN is one of the simplest. The flowchart of the K-NN algorithm is shown in Fig. 4.

Fig. 4

Flowchart for K-NN classifier algorithm
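The K-NN procedure just described (compute distances, pick the k closest training samples, take a majority vote) can be sketched as follows; the toy feature vectors and instrument labels are purely illustrative.

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training samples."""
    distances = np.linalg.norm(train_X - x, axis=1)   # Euclidean distance to every sample
    nearest = np.argsort(distances)[:k]               # indices of the k closest samples
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D feature vectors for two instrument classes (values are illustrative)
train_X = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
                    [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]])
train_y = ["piano", "piano", "piano", "guitar", "guitar", "guitar"]
print(knn_predict(train_X, train_y, np.array([1.1, 1.0]), k=3))  # → piano
```

With k = 1 the vote reduces to the single nearest neighbor, which matches the K = 1 setting evaluated in Sect. 4.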

4 Results

We used MFCC for feature extraction, and the following figures show intermediate results as the input sound signal is converted into the Mel frequency spectrum. Figure 5 shows the waveform of the input sound signal.

Fig. 5

Input sound signal

Figure 6 shows the effect of pre-emphasis, i.e. noise removal, on the input sound signal, where S(n) is the input sound signal, n is the time index, and α = 0.97:

$$ \hat{S}(n) = S(n) - \alpha\, S(n-1) $$
(1)
Fig. 6

Pre-emphasis filtering
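Eq. (1) is a one-line filter in practice. The sketch below applies it to a short illustrative sequence; the first sample is kept unchanged since it has no predecessor.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """Eq. (1): s_hat[n] = s[n] - alpha * s[n-1], keeping the first sample as-is."""
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

s = np.array([1.0, 2.0, 3.0, 4.0])
print(pre_emphasis(s))  # [1.   1.03 1.06 1.09]
```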

Figure 7 shows a Hamming window with 64 samples. A Hamming window is generally used to prevent dramatic changes at the edges of a window, as follows:

$$ W(n)=\begin{cases}0.54-0.46\cos\left(\dfrac{2\pi n}{N-1}\right), & 0\le n\le N-1\\ 0, & \text{otherwise}\end{cases} $$
(2)
Fig. 7

Hamming window
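Eq. (2) can be checked directly against NumPy's built-in window, which uses the same 0.54/0.46 coefficients:

```python
import numpy as np

def hamming(N):
    """Eq. (2): w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1)), 0 <= n <= N-1."""
    n = np.arange(N)
    return 0.54 - 0.46 * np.cos(2 * np.pi * n / (N - 1))

w = hamming(64)                  # the 64-sample window shown in Fig. 7
print(round(w[0], 2), round(w[-1], 2))  # 0.08 0.08 (tapered endpoints)
```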

Window processing is used to remove additional high-frequency components at both edges of each frame of the sound signal; it emphasizes the signal at the center of the frame. To better observe the characteristics of the sound signal, it is divided into frames, generally with 50% overlap. This is referred to as framing, as shown in Fig. 8.

Fig. 8

Different frames of signal
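Framing with 50% overlap means each frame starts half a frame length after the previous one. A minimal sketch, using a short integer ramp so the overlap is easy to see:

```python
import numpy as np

def frame_signal(signal, frame_len, overlap=0.5):
    """Split a 1-D signal into fixed-length frames with the given fractional overlap."""
    hop = int(frame_len * (1 - overlap))        # step between consecutive frame starts
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n_frames)])

s = np.arange(10.0)
frames = frame_signal(s, frame_len=4)           # hop = 2, i.e. 50% overlap
print(frames)                                   # rows share half their samples
```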

Figure 9 shows the triangular filter bank output, which reduces the frequency scale. A triangular filter bank consists of a number of triangular bandpass filters.

Fig. 9

Tri-filter bank signal
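A triangular filter bank can be constructed by spacing filter center frequencies evenly on the mel scale of Eq. (3) and mapping them back to FFT bins. The parameter values below (26 filters, 512-point FFT, 16 kHz sampling) are common illustrative choices, not necessarily those used in our experiments.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_filterbank(n_filters, n_fft, sample_rate):
    """Build n_filters triangular bandpass filters evenly spaced on the mel scale."""
    mel_points = np.linspace(0, hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    banks = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):           # rising edge of the triangle
            banks[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):          # falling edge of the triangle
            banks[i - 1, k] = (right - k) / (right - center)
    return banks

fb = triangular_filterbank(n_filters=26, n_fft=512, sample_rate=16000)
print(fb.shape)  # (26, 257)
```

Multiplying each frame's power spectrum by this matrix gives the per-band energies whose logarithms appear in Fig. 11.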

The discrete cosine transform converts the log filter bank energies, shown in Fig. 11, from the frequency domain back to a time-like (cepstral) domain, yielding the Mel frequency cepstrum shown in Fig. 10. In sound identification systems, the most commonly used frequency scale transformation is the Mel scale, given by:

$$ \mathrm{Mel} = 2595 \times \log_{10}\left(1+\frac{f}{700}\right) $$
(3)
Fig. 10

Mel frequency cepstrum

Fig. 11

Log filter bank energies
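Eq. (3) and the final DCT step can be checked numerically. The snippet below is a plain-Python sketch: the DCT here is an unnormalized type-II transform, and real MFCC implementations typically keep only the first 12 or 13 coefficients.

```python
import math

def hz_to_mel(f):
    """Eq. (3): Mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def dct2(x):
    """Unnormalized type-II DCT, turning log filter-bank energies into cepstral coefficients."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N)) for n in range(N))
            for k in range(N)]

# The 2595 constant is chosen so that 1000 Hz maps to (almost exactly) 1000 mel
print(round(hz_to_mel(1000)))  # 1000
# A flat log-energy vector concentrates all energy in the 0th cepstral coefficient
coeffs = dct2([1.0] * 8)
print(round(coeffs[0], 6))     # 8.0
```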

We have built a graphical user interface for our recognition system, which displays the MFCC-related results as waveforms, as shown in Fig. 12. It contains all the parameters used in feature extraction and classification. The Run code button recognizes the instrument played in the selected sound file, and the Reset button clears all previous output waveforms. After the testing phase, the GUI displays the final recognition result, i.e. the name of the instrument.

Fig. 12

GUI for musical instrument recognition

The K-NN algorithm gives different results for different values of K and different numbers of instruments. Tables 1, 2, and 3 give confusion matrices for the K-NN algorithm with K = 5, K = 3, and K = 1, respectively. Table 1 (K = 5) gives the lowest accuracy of the three. Table 2 (K = 3) gives higher accuracy than Table 1 but lower than Table 3. Table 3 (K = 1) gives the maximum accuracy, reaching 100% for various numbers of instruments.

Table 1 Confusion Matrix for K-NN algorithm with K = 5
Table 2 Confusion Matrix for K-NN algorithm with K = 3
Table 3 Confusion Matrix for K-NN algorithm with K = 1
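The accuracy values reported in the tables follow from the confusion matrices: correct predictions lie on the diagonal, so overall accuracy is the trace divided by the total. The matrix values below are hypothetical, used only to show the computation.

```python
import numpy as np

def accuracy_from_confusion(matrix):
    """Overall accuracy = trace (correct predictions) / total number of predictions."""
    matrix = np.asarray(matrix, dtype=float)
    return np.trace(matrix) / matrix.sum()

# Hypothetical 3-instrument confusion matrix (rows: true class, columns: predicted)
cm = [[18, 1, 1],
      [2, 17, 1],
      [0, 0, 20]]
print(round(accuracy_from_confusion(cm), 3))  # 55 correct of 60 -> 0.917
```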

5 Conclusion

To implement the musical instrument identification system, we carried out extensive experiments with the MFCC feature extraction technique and the SVM and KNN classifiers, and concluded that MFCC gives better performance with the SVM classifier than with the K-NN classifier.

The system gives good accuracy in identifying musical instruments. Future work will incorporate more audio features, such as PLP or wavelet transform features, and extend instrument recognition to polyphonic music.