Abstract
This paper presents a flow pattern classification method that uses the recorded sound of flows and the spectrogram images synthesized from it. The sounds of four flow patterns (stratified, churn, annular and slug flow) are recorded and converted to spectrograms for further analysis. The proposed method uses a new version of the local binary pattern (LBP) to extract robust features from the noisy audio recordings. We propose an adaptive threshold function based on the cumulative distribution function (CDF) of the Gaussian distribution. The performance of the proposed algorithm is analyzed on both the 836 recorded flow-pattern sounds and the RWCP database to validate the enhancements. The validation covers two scenarios: one with the same noise signals in the training and test sets, and one with different noises in each set. Furthermore, the new method is compared with three other methods (MFCC-HMM, SIF-SVM and MC-BDLBP). The comparisons show that the new method performs better with reduced complexity.
1 Introduction
Classifying the type of a 2-phase flow is an important practice in industries such as oil, gas, petrochemicals and food. It helps the process control system adjust parameters to control the occurrence of particular patterns or to prevent the damage that some of them may cause [1, 2]. Two main techniques are usually used to identify these patterns. The first analyzes process parameters, e.g., it looks for evidence of a particular flow type in parameters such as velocity and pressure changes along the pipe. The second focuses on processing the recorded sound of the flow. The parameter analysis method is not automated and requires much more time for each instance. The sound-based method, in turn, falls into two major techniques: feeding the sound signal directly to classification algorithms [23], or analyzing the image representation of the flow sound spectrogram with image processing techniques. Owing to its cost-effectiveness, ease of implementation and the wide variety of available image processing techniques, analyzing the spectrogram of the flow sound as an image seems the more convenient method. Because the identification is done by image processing, numerous classification methods become possible. 2-phase flow classification is specified for two configurations: (1) a horizontal pipe and (2) a vertical pipe. When gravity acts perpendicular to the pipe axis, as in a horizontal pipe, phase separation occurs. Some of the common 2-phase patterns in industry are bubbly, annular, plug, churn, stratified and slug flow [2].
Bubbly flow is a continuous stream of liquid with bubbles scattered through it. In annular flow, the liquid flows along the pipe wall and the gas occupies the center. In the plug flow pattern, the bubbles coalesce, and larger bubbles are found close to the pipe wall. Churn flow is an oscillatory flow. In a stratified pattern, the liquid and gas are completely separated by a smooth horizontal interface. Slug flow is the remaining form; it is seen in crude oil and tar production plants, particularly in pipelines conveying crude oil or liquid gas [3].
SVM- and neural-network-based methods are among the techniques used for 2-phase flow classification [4, 11]. The authors of [5] measure the performance of an airlift pump at various submergence ratios and compare the results to Hewitt and Roberts' flow pattern map [6]. Flow classification is performed using textural features and an SVM classifier in [7]. Furthermore, a fuzzy neural network is proposed in [8] to identify bubbly, slug and plug flow patterns. However, according to the existing literature, flow characterization faces some obstacles. Because these characterizations are dimensionally sensitive, they can only be used with the specific parameters of the study that produced them. Creating a universal flow pattern map for various liquids and pipe settings is a difficult undertaking. Because some flow transitions are governed by the Weber number while others are characterized by the Reynolds number, it is almost impossible to come up with a universal dimensionless flow pattern map. Figure 1 depicts the occurrence of various 2-phase flow patterns in a horizontal pipe [9].
This paper presents flow type identification based on audio signal classification to tackle the flow mapping problems mentioned above. The hidden Markov model (HMM), a traditional speech-recognition technique, combined with mel-frequency cepstral coefficients (MFCC), was used in early research on audio signal classification [10, 12]. Later, other techniques such as MPEG-7 audio features [13], matching pursuit features [14] and spectro-temporal signatures [15] were developed as supplements to MFCC. Techniques such as the mel-frequency cepstrum [16], the short-time discrete Fourier transform (STFT) and the Gabor spectrogram [17] are widely used in audio signal classification because of their ability to work with spectra.
The spectrogram of the audio signal provides helpful information for sound classification [24]. Newer approaches treat the spectrogram as a textured image [18–22, 25] and use image processing techniques for audio signal classification. The spectrogram is a composite image that illustrates the distribution of a signal's intensity over multiple frequencies at different time intervals. It is not a standard image. By recording the intensity differences between a pixel and its neighbors, the local binary pattern (LBP) approach retrieves the spectrogram's micro-patterns.
The LBP feature extractor has different forms, and these variations are commonly used in a variety of image processing applications [25, 26]. The LBP features obtained from the spectrogram have several applications, one of which is audio signal classification [26]. The drawback of LBP features is their sensitivity to noise [27]. Different factors affect the reliability of the LBP features, mostly through a lack of components in the LBP histogram. Patch-wise LBP processing is one such factor, since it leaves fewer components in each patch's histogram. In addition, the occurrence frequencies of the patterns differ vastly, so these patterns cannot be evaluated precisely. LBP feature extraction with an adaptive threshold is proposed to deal with such problems. The proposed method is used to classify and identify four flow patterns from sound signals. Furthermore, the new approach is compared with the three works in [18, 19, 26] using the RWCP database.
The rest of the paper is organized as follows:
Section 2 briefly explains the experimental setup for recording the flow's sound in the pipeline. The proposed method is described in Sect. 3. Section 4 presents the simulation results. Finally, Sect. 5 presents the conclusions.
2 Experiment's Setup
Two hydrophones are used in the setup, and both are connected through an amplifier to a signal analysis workstation. The position of the hydrophones is shown in Fig. 2. The system's input consists of signals in the frequency range of 0.1–10 kHz. The diameters of the pipe's upward side and downward side are 5 inches and 3 inches, respectively. Figure 3 illustrates the schematic of the pipe with the positions of the two hydrophones.
3 Proposed LBP with Gaussian Reweighting Thresholding
3.1 Texture Analysis
In most applications, time–frequency analyses are used to create the spectrogram of a signal. Gammatone filters and the short-time Fourier transform are two examples of the existing approaches [18, 19]. The main difference between these two methods is the spacing of the frequency positions on the axis. In Gammatone-like spectrograms, the center frequencies are equally spaced on the equivalent rectangular bandwidth (ERB) scale. This makes the Gammatone approach a closer approximation of the human auditory system than the linear spacing of the frequency points on the STFT spectrogram. Time–frequency representations are suitable options for providing rich information about the texture. Among the common spectrograms, the logarithm of the Gammatone-like spectrogram (LGS) visually contains more texture information, making the LGS a suitable candidate.
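As an illustration of ERB-scale spacing, the sketch below places center frequencies uniformly on an ERB-rate axis. It assumes the Glasberg and Moore ERB-rate formula, which is one common choice and is not confirmed by this paper. The defaults (50 channels between 100 and 10,000 Hz) follow the setting used in this work, and the function names are illustrative.

```python
import numpy as np

def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (assumed Glasberg-Moore form)."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f_hz, dtype=float))

def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (np.asarray(erb, dtype=float) / 21.4) - 1.0) / 0.00437

def erb_center_frequencies(f_min=100.0, f_max=10_000.0, n_filters=50):
    """Center frequencies that are equally spaced on the ERB-rate scale."""
    erb_points = np.linspace(hz_to_erb_rate(f_min), hz_to_erb_rate(f_max), n_filters)
    return erb_rate_to_hz(erb_points)

centers = erb_center_frequencies()
print(centers[:3], centers[-3:])
```

On the Hz axis the resulting spacing grows with frequency, in contrast to the constant bin spacing of an STFT.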
For the analysis, a bank of 50 Gammatone filters is used to compute the spectrogram. The frequency points are spaced on the ERB scale between 100 and 10,000 Hz [18, 19]. The Gammatone spectrogram is denoted by \(S(f, t)\), where \(f\) is the frequency and \(t\) is the time. To enhance the low-power components, the following formula is used, which helps to reveal more details, as illustrated in [26]:
The logarithm of the Gammatone-like spectrogram could be obtained by the following formula [26]:
3.2 Traditional LBP Features
Many feature extraction approaches have been proposed to analyze images, among them SIFT [28], HOG [29] and LBP [30]. LBP is preferable among these feature extraction algorithms because of its easy implementation and effective results. It is crucial to treat the spectrogram as a synthesized image and to acknowledge the many differences between a spectrogram and a standard image. The following are some of the visual distinctions between a spectrogram and a standard image [26]:
Smoothness: The intensities of a standard image change smoothly, whereas neighboring pixels in a spectrogram can differ remarkably. The reason is that these pixels represent the power distribution of a sound signal, which can vary strongly across time–frequency instances.
Translation, scaling and rotation: The location and scale can vary between different shots in standard images. However, translation may only appear along the time axis in the spectrogram. Hence, feature extraction methods such as SIFT, which extracts scale-invariant features, are not useful for spectrograms.
Micro-structure: Edges, spots and corners are usually considered essential details in natural images, but these micro-structures may not appear in the spectrogram. Therefore, edge-based feature extraction approaches such as HOG may not be appropriate for frequency analysis.
This paper considers LBP features the best choice for classifying flow patterns because of the following advantages. (1) The LBP feature works with the signs of intensity differences, which makes it resistant to monotonic changes in the spectrogram image. (2) In addition to edges and corners, LBP can extract other micro-structures that HOG cannot capture. These advantages make LBP a good fit for spectrogram feature extraction.
Conventional LBP algorithms use a fixed zero threshold [30]. The LBP code is first computed for every pixel. A pixel's intensity is denoted by \(i\), and \(i_p\) denotes the intensity of its \(p\)th neighboring pixel, where \(p = 1, 2, \ldots, P\) and \(P\) is the number of neighbors. Here, \(R\) is the distance between \(i_p\) and \(i\). With this notation, \(\text{LBP}_{P,R}\) encodes the pixel difference \(z_p = i_p - i\) between \(i_p\) and \(i\) at distance \(R\). Every LBP bit is encoded as follows:
\[ b_p = \begin{cases} 1, & z_p \ge 0 \\ 0, & z_p < 0 \end{cases} \tag{3} \]
The LBP code is formed as \(\overrightarrow {b_P b_{P - 1} \ldots b_1}\) from these LBP bits. The histogram of the \(\text{LBP}_{P,R}\) codes, with dimension \(2^P\), is then used as the feature vector. Using more neighbors captures more information about the image but also increases the dimension.
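The encoding and histogram steps described above can be sketched as follows for \(P = 8\) and \(R = 1\). This is a minimal illustration rather than the authors' implementation. The neighbor ordering, the handling of border pixels (they are skipped) and the function names are assumptions.

```python
import numpy as np

# (row, col) offsets of the P = 8 neighbors at distance R = 1, in an assumed order.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_codes(image):
    """Zero-threshold LBP code for every interior pixel of a 2-D array."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        neighbor = img[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc]
        z = neighbor - center                       # pixel difference z_p
        codes |= (z >= 0).astype(np.int64) << bit   # only the sign of z_p is kept
    return codes

def lbp_histogram(codes, n_neighbors=8):
    """Normalized histogram of LBP codes with 2**P bins."""
    hist = np.bincount(codes.ravel(), minlength=2 ** n_neighbors)
    return hist / hist.sum()

patch = np.array([[5, 0, 0],
                  [0, 1, 0],
                  [0, 0, 0]])
print(lbp_codes(patch))
```

With eight neighbors the feature vector has 256 bins, which shows how the dimension grows as \(2^P\).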
3.3 Proposed Method
Illumination variation is an important factor that affects the pixel intensity changes of standard images. Thus, the pixel difference \(z_p\) also varies significantly. It can be concluded from Eq. (3) that LBP uses only the sign of the pixel difference and discards its amplitude. The amplitude of the pixel difference, however, contains meaningful data describing the spectrogram. Therefore, both the sign and the amplitude of the pixel difference are used in [26] when extracting the patterns from the spectrogram. Conventional LBP features use a fixed zero threshold, which makes them sensitive to noise [27]: noise can flip pattern bits when the pixel differences are small. A multi-channel LBP (MCLBP) feature, which quantizes the pixel differences using multiple thresholds to capture both their sign and amplitude, is proposed in [26]. One way to deal with this issue is to threshold the pixel differences at \(T_i\) instead of at 0 [26, 33].
where \(i\) indicates the channel number and \(b_p^i\) is the \(p\)th LBP bit in channel \(i\). This method uses a fixed, user-defined threshold for each channel. This paper proposes a new adaptive threshold function based on the local and global spectrogram information. The method does not create additional image channels; instead, the proposed threshold changes in each frame. The threshold function is based on the cumulative distribution function (CDF) of the Gaussian distribution, so that the LBP features are both fast and robust, as below:
where \(M\) is the number of spectrogram frames, \(\sigma\) and \(\mu\) are the global standard deviation and mean, and \(\mu_i\) and \(\sigma_i\) are the mean and standard deviation within a frame of the image. The error function \({\text{erf}}\left( x \right)\) is defined as below:
\[ {\text{erf}}\left( x \right) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt \]
The proposed function is plotted in Fig. 4 for three different values of \(\mu\) and \(\sigma\).
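For orientation, the Gaussian CDF on which the threshold function is built can be evaluated through the error function using the standard identity \(\Phi(x) = \tfrac{1}{2}\left[1 + \text{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right]\). The sketch below shows only this building block, not the proposed threshold function itself. The three \((\mu, \sigma)\) pairs are arbitrary illustrative values, not those of Fig. 4.

```python
import math

def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a Gaussian with mean mu and standard deviation sigma, via erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Arbitrary example parameters (assumed for illustration only).
for mu, sigma in [(0.0, 1.0), (0.0, 2.0), (1.0, 0.5)]:
    values = [round(gaussian_cdf(x, mu, sigma), 3) for x in (-2.0, 0.0, 2.0)]
    print(f"mu={mu}, sigma={sigma}: {values}")
```

A larger \(\sigma\) flattens the curve and a different \(\mu\) shifts it, so the output of a CDF-based threshold changes with the frame statistics.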
Finally, each LBP bit is encoded as below in the adaptive threshold LBP (AT-LBP):
Noise properties play a determining role in the formation of the threshold function. When the image noise increases, the threshold value increases, and vice versa, which leads to more robust LBP features.
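The effect of a frame-dependent threshold on the LBP bits can be sketched as follows. The sketch makes several assumptions that the text does not confirm: a frame is taken to be one time column of the spectrogram, a bit is set when \(z_p\) is at least the threshold of the frame that contains the center pixel, and the thresholds are passed in as placeholder values rather than computed with the proposed CDF-based function. The function name is illustrative.

```python
import numpy as np

# (row, col) offsets of the P = 8 neighbors at distance R = 1, in an assumed order.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def thresholded_lbp_codes(spectrogram, frame_thresholds):
    """LBP codes where frame (column) j uses its own threshold frame_thresholds[j]."""
    img = np.asarray(spectrogram, dtype=float)
    thr = np.asarray(frame_thresholds, dtype=float)
    h, w = img.shape
    if thr.shape != (w,):
        raise ValueError("need exactly one threshold per frame (column)")
    center = img[1:-1, 1:-1]
    t = thr[1:-1][np.newaxis, :]   # threshold of the frame holding each center pixel
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dr, dc) in enumerate(OFFSETS):
        neighbor = img[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc]
        codes |= ((neighbor - center) >= t).astype(np.int64) << bit
    return codes

flat = np.ones((4, 5))
print(thresholded_lbp_codes(flat, np.zeros(5)))       # zero threshold
print(thresholded_lbp_codes(flat, np.full(5, 0.5)))   # placeholder thresholds
```

On a flat patch a zero threshold sets every bit, while a positive threshold clears them. This is the mechanism by which a larger threshold keeps small, noise-induced differences from changing the pattern.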
4 Simulation Results
This section compares flow pattern identification from spectrograms of recorded sound signals using the proposed method and the conventional LBP. The proposed method is investigated for four flow patterns: stratified, churn, annular and slug. In addition, classification on the RWCP database is conducted to further validate the proposed method and to compare the results with three other methods: MFCC-HMM [19], Spectrogram Image Feature (SIF) [18] and MC-BDLBP [26]. The RWCP database [31] is an appropriate benchmark for validating the performance of classification algorithms and is used in many studies. The database consists of crash sounds of plastic and wood as well as other sounds such as coins, bells and saws. To study the effectiveness of the proposed method, the MFCC-SVM and SIF-SVM schemes are applied using the proposed AT-LBP features instead of MFCC-HMM and SIF. The experiment is performed under four noise levels, 20, 10, 0 and −5 dB, using the NOISEX92 database [32]. The linear SVM used for classification has its cost parameter set to 40.
In the first experiment, the classification of pipeline flow, all the signals are sampled at 16 bits and 50 kHz using two hydrophones. The samples are collected from four classes of flow pattern and comprise 836 samples in total. The number of samples for each class is shown in Table 1. In this experiment, 30% of the samples are used for training, and the added noise is the same for the training and test samples. As shown in Table 2, the proposed method achieves 98.7% correct classification for clean samples, whereas the traditional LBP achieves 96%, and the new method shows much better performance when noise is added.
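A common way to prepare noisy samples at a target SNR is to rescale the noise before adding it to the clean signal. The sketch below assumes this is how the 20, 10, 0 and −5 dB conditions are produced. The function name `mix_at_snr` and the synthetic signals are illustrative and are not taken from the paper or from NOISEX92.

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Add noise to a 1-D signal so that the result has the requested SNR in dB."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if noise.size < signal.size:
        raise ValueError("noise must be at least as long as the signal")
    noise = noise[:signal.size]
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    # Scale the noise so that p_signal / (gain**2 * p_noise) == 10**(snr_db / 10).
    gain = np.sqrt(p_signal / (p_noise * 10.0 ** (snr_db / 10.0)))
    return signal + gain * noise

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440.0 * np.arange(5000) / 50_000.0)  # synthetic tone at 50 kHz
noise = rng.standard_normal(5000)                                # stand-in noise
noisy = {snr: mix_at_snr(clean, noise, snr) for snr in (20, 10, 0, -5)}
```

Each entry of `noisy` can then be converted to a spectrogram and passed through feature extraction in the same way as the clean recording.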
The RWCP database contains 9722 sound event files in 105 distinct sets. The same partitioning of RWCP as in [18, 19, 26] is applied. A total of 50 audio signal sets are chosen from the RWCP database. In total, 2500 and 1500 audio signals are used for training and testing, respectively. As in [26], two scenarios for adding noise are employed. In the first scenario, the SNRs are uniformly sampled from [−5, 25] with the same type of noise in both the training and test samples. The recognition rates are presented in Table 3. As shown in Table 3, the proposed method achieves a better classification rate than the state-of-the-art methods in [18, 19]. However, the method in [26] performs better on the RWCP database at the cost of more complexity.
Table 4 shows the classification rates for the second scenario, where the testing noise type differs from the training noise type. The mismatch between the noise types in the training and test samples reduces the recognition rate. In this case, the proposed approach achieves a better classification rate than MFCC-HMM and SIF-SVM for most noise types. As shown in Table 3, the results are on par with MC-BDLBP while maintaining lower complexity.
5 Conclusions
This paper proposes a new LBP feature based on thresholding with the cumulative distribution function of the Gaussian distribution. The algorithm is an enhancement of the conventional LBP algorithm. The new algorithm is compared with two other feature extraction techniques (e.g., MFCC-HMM and Spectrogram Image Feature (SIF)) through Monte Carlo simulations. The simulation results show that the proposed method performs better, providing a higher classification rate with less complexity.
References
Dinh TB, Kim BS, Choi T-S (1999) Application of image processing techniques to air/water two-phase flow. In: Applications of digital image processing XXII, vol 3808. International Society for Optics and Photonics, pp 725–731
Van Hout R, Barnea D, Shemer L (2001) Evolution of statistical parameters of gas-liquid slug flow along vertical pipes. Int J Multiphase Flow 27(9):1579–1602; Chen J, Richard C, Bermudez JC, Honeine P (2014) Variants of non-negative least-mean-square algorithm and convergence analysis. IEEE Trans Sig Proc 62(15):3990–4005
Shanthi C, Pappa N (2017) An artificial intelligence-based improved classification of two-phase flow patterns with feature extracted from acquired images. ISA Trans 68:425–432
Shanthi C, Pappa N, Suganya JA (2013) Digital image processing based flow regime identification of gas/liquid two-phase flow. IFAC Proc 46(32):409–414
Hanafizadeh P, Ghanbarzadeh S, Saidi MH (2011) Visual technique for detection of gas-liquid two-phase flow regime in the airlift pump. J Petrol Sci Eng 75(3–4):327–335
Hewitt GF, Roberts D (1969) Studies of two-phase flow patterns by simultaneous x-ray and flash photography. In: Atomic energy research establishment. Harwell, England (United Kingdom)
Huang G, Ji H, Huang Z, Wang B, Li H (2011) Flow regime identification of mini-pipe gas-liquid two-phase flow based on textural feature series. In: Instrumentation and measurement technology conference (I2MTC), 2011 IEEE. IEEE, pp 1–4
Shi L (2007) Fuzzy recognition for gas-liquid two-phase flow pattern based on image processing. In: Control and automation, 2007. ICCA 2007. IEEE international conference on, 2007. IEEE, pp 1424–1427
Bothamley M (2013) Gas/liquid separators: quantifying separation performance-Part 1. Oil Gas Facilities 2(04):21–29
Cowling M, Sitte R (2003) Comparison of techniques for environmental sound recognition. Pattern Recogn Lett 24(15):2895–2907
Guo G, Li SZ (2003) Content-based audio classification and retrieval by support vector machines. IEEE Trans Neural Netw 14(1):209–215
Temko A, Nadeu C (2005) Classification of meeting-room acoustic events with support vector machines and variable-feature-set clustering. In: Acoustics, speech, and signal processing, 2005. Proceedings.(ICASSP'05). IEEE international conference on, 2005. vol 5. IEEE, pp. v/505–v/508
Muhammad G, Alghathbar K (2009) Environment recognition from audio using MPEG-7 features. In: Embedded and multimedia computing, 2009. EM-Com 2009. 4th international conference on, 2009. IEEE, pp 1–6
Chu S, Narayanan S, Kuo C-CJ (2009) Environmental sound recognition with time-frequency audio features. IEEE Trans Audio Speech Lang Process 17(6):1142–1158
Tran HD, Li H (2011) Sound event recognition with probabilistic distance SVMs. IEEE Trans Audio Speech Lang Process 19(6):1556–1568
Logan B (2000) Mel frequency cepstral coefficients for music modeling. ISMIR 270:1–11
Qian S, Chen D (1999) Joint time-frequency analysis. IEEE Signal Process Mag 16(2):52–67
Dennis J, Tran HD, Li H (2011) Spectrogram image feature for sound event classification in mismatched conditions. IEEE Signal Process Lett 18(2):130–133
Dennis J, Tran HD, Chng ES (2013) Image feature representation of the subband power distribution for robust sound event classification. IEEE Trans Audio Speech Lang Process 21(2):367–377
Deshpande H, Singh R, Nam U (2001) Classification of music signals in the visual domain. In: Proceedings of the COST-G6 conference on digital audio effects, pp 1–4
Paraskevas I, Chilton E (2003) Audio classification using acoustic images for retrieval from multimedia databases. In: Video/image processing and multimedia communications, 2003. 4th EURASIP conference focused on, 2003, vol 1. IEEE, pp 187–192
Yu G, Slotine JJ (2009) Audio classification from time-frequency texture. In: Acoustics, speech and signal processing, 2009. ICASSP 2009. IEEE international conference on, 2009. IEEE, pp 1677–1680
Matsui T, Goto M, Vert J-P, Uchiyama Y (2011) Gradient-based musical feature extraction based on scale-invariant feature transform. In: Signal processing conference, 2011 19th European, 2011. IEEE, pp 724–728
Kim MJ, Kim H (2012) Audio-based objectionable content detection using discriminative transforms of time-frequency dynamics. IEEE Trans Multimedia 14(5):1390–1400
Costa YM, Oliveira L, Koerich AL, Gouyon F, Martins J (2012) Music genre classification using LBP textural features. Signal Process 92(11):2723–2737
Ren J, Jiang X, Yuan J, Magnenat-Thalmann N (2017) Sound-event classification using robust texture features for robot hearing. IEEE Trans Multimedia 19(3):447–458
Ren J, Jiang X, Yuan J (2013) Noise-resistant local binary pattern with an embedded error-correction mechanism. IEEE Trans Image Proc 22(10):4049–4060
Geng C, Jiang X (2011) Face recognition based on the multi-scale local image structures. Pattern Recogn 44(10–11):2565–2575
Satpathy A, Jiang X, Eng H-L (2014) Human detection by quadratic classification on the subspace of extended histogram of gradients. IEEE Trans Image Process 23(1):287–297
Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
Nakamura S, Hiyane K, Asano F, Nishiura T, Yamada T (2002) Acoustical sound database in real environments for sound scene understanding and hands-free speech recognition
Varga A, Steeneken HJ (1993) Assessment for automatic speech recognition: II. NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems. Speech Commun 12(3):247–251
Safaei N, Smadi O, Safaei B, Masoud A (2021) Efficient road crack detection based on an adaptive pixel-level segmentation algorithm. Transp Res Rec 2675(9):370–381
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Parsai, S., Ahmadi, M. (2023). Flow Pattern Recognition Using Spectrogram of Flow Generated Sound with New Adaptive LBP Features. In: Yang, XS., Sherratt, S., Dey, N., Joshi, A. (eds) Proceedings of Seventh International Congress on Information and Communication Technology. Lecture Notes in Networks and Systems, vol 464. Springer, Singapore. https://doi.org/10.1007/978-981-19-2394-4_37
DOI: https://doi.org/10.1007/978-981-19-2394-4_37
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-2393-7
Online ISBN: 978-981-19-2394-4
eBook Packages: Engineering, Engineering (R0)