Abstract
Punjabi is a tonal and under-resourced language among the Indo-Aryan languages of the Indo-European family. Its many dialectal variations pose challenges for designing an Automatic Speech Recognition (ASR) system, so studying essential features of the language, such as tone, is important for designing an effective ASR. This paper focuses on the variation of tonal characteristics across Punjabi dialects. The speech corpus was collected from native speakers of Punjab (covering its various dialects) and from areas in the Himachali belt of Punjab. The analysis shows that tonal words and dialectal word information have a major impact on the information conveyed by the speaker. The analyzed data show pitch variations in tonal words that differ from region to region. The experiments use the Praat toolkit to calculate F0 values; based on the pitch and frequency variations, we find that tonal words show dialectal variation when the same sentence is spoken by speakers of different regions.
1 Introduction
In a land as diverse as India, the various states and provinces each follow their own spoken languages. According to Indian grammarians, there are three recognized grades of accent. The first, “Udatta”, means raised or elevated and indicates the highest pitch. The second, “Anudatta”, means unelevated or unaccented and refers to a low-pitched syllable. The third, “Svarita”, is a mixture of low and high pitch within a syllable. When Grierson compared this Vedic nomenclature of accents with the tones of Punjabi, he found that only “Udatta” could be compared to the high tone of Punjabi, whereas, in contrast to “Svarita”, the low tone of Punjabi falls on the first syllable but rises on the rest [1]. This mismatch became the basic motivation for this study, because many aspects of Punjabi remain unexplored owing to its being an under-resourced language. The variation in these tones results from the position of the five tonemes ( ) of Punjabi [2]. Besides these tonemes, one of the phonemes also exhibits tonal characteristics when placed in the final position and shows a level tone whenever it occurs in the initial position. The differences in pitch values attracted further interest when it was noticed that dialectal differences, when analyzed, exhibit different tonal characteristics. The later part of the paper analyzes the pitch variation among dialectal words that arises from regional variation within the same state. The importance of this study lies in the fact that correct lexical identification is a crucial task when designing any speech processing system.
2 Literature Review
According to Ethnologue's 2005 estimate, there are 88 million native speakers of Punjabi, which makes it the 10th most widely spoken language in the world, and according to the 2001 Census of India, there are 29,102,477 Punjabi speakers in India [3]. Even so, research on this language has progressed slowly. The features of the speech signal related to tone have mostly been studied for Mandarin. A tone detection methodology for the Mizo language was designed in 2015 using quantitative analysis of the acoustic features of Mizo [4]. There, the tone was detected by relying on slope and height, owing to the availability of a large database. Z-score normalization of the signal was used to eliminate the effects of gender, and the pitch variance results were then compared to decide whether a tone should be marked as high, low, falling, or rising. Singh, Pandey, and Agrawal's (2015) study of Punjabi tonemes [3] covered the five tonemes of Punjabi and their high, low, and mid tones, and also discussed the IPA representation of these high- and low-tone words. An experimental study of the tonal characteristics of the laryngeal phoneme of Punjabi examined words of the Malwai dialect containing phonemes that carry tonal effects, recorded from native Punjabi speakers, with experiments performed using Praat and Matlab [2]. The tonal analysis of the /h/ phoneme was carried out using the fundamental frequency (F0) contour. This study showed that at the syllable level /h/ can reflect tonal occurrences, whereas no such effect is observed when /h/ occurs in the initial position [2]. Vowel phonemes in Punjabi have also been analyzed, but there still persists a twofold interface on acoustic features of vowels in two different languages [1].
That paper emphasizes that other non-native languages and changing circumstances have a significant impact on the original Punjabi language, which is an essential factor to keep in mind when designing any ASR system. Furthermore, the work on the detection of Mizo tones [4] included an extensive technical study of the tonal lexicon of Mizo. Another paper, on lexical stress in Punjabi and its representation in PLS, presented information related to PLS design together with a new study of the relation between suprasegmental features such as tone, nasalization, and stress at the syllable level [5]. The study made it evident that non-tonal disyllabic words can also carry stress on the second syllable, which can be illustrated through the IPA transcription contained in the encoded PLS data.
3 Speech Corpus Structure
The regions of undivided Punjab include the Malwa, Doaba, Majhi, and Puadh regions of Punjab along with the Himachali belt (Kullui, Mandiali, Kangri, and Chambiali). The designed corpus consists of dialectal varieties of undivided Punjab and was enriched by including the dialectal linguistic differences from these regions. The sample speech sentences used in the corpus are shown in Fig. 1.
The input signal is recorded at 44 kHz using the Sound Forge software in a studio environment. The pitch of the analyzed dataset is estimated, with respect to tonal and dialectal variations, using the Praat software. The speech corpus includes certain words whose pitch and lexical form vary from region to region; some of these words are listed in Table 1.
The motive behind this corpus design was to include all possible dialectal variations of Punjabi, considering its dialect-dependent tonal variations. Section 4 shows how the speech signal is modeled, in order to establish the need to study tones when designing a speech system, especially for an under-resourced language like Punjabi.
4 Speech Signal Modeling
An input signal is studied for the tonal variations caused by dialectal differences and by the position of its vowel. In modeling the speech signal, the recorded signal was studied in terms of three tone levels (low, high, and mid). Figure 2 shows the canonical pitch contour for Punjabi. The high tone is a rising–falling tone ( ), the low tone is a falling tone ( ), and the mid tone has a pitch intermediate between the high and low tones [6].
Further analysis of the signal is done as per the block diagram shown in Fig. 3.
The input signal is based on dialectal variations of Punjabi. The fundamental frequency (F0) of the signal is then Z-score normalized to make it immune to gender effects. The tonal word is identified from the annotations provided in the Praat software, and the pitch and intensity contours are analyzed from the given input signal.
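The F0 extraction step of this pipeline can be illustrated with a minimal sketch. It is not Praat's pitch algorithm, which is considerably more refined; it only picks the lag with the largest autocorrelation within a plausible pitch range. The function name `estimate_f0`, the 75 to 500 Hz search range, the 8 kHz sampling rate, and the synthetic 200 Hz test tone are all assumptions made for illustration and do not come from the corpus.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate the F0 (Hz) of one voiced frame from its autocorrelation peak.

    Simplified illustration only; f0_min and f0_max are assumed bounds.
    """
    # Candidate lags correspond to periods between 1/f0_max and 1/f0_min seconds.
    min_lag = int(sample_rate / f0_max)
    max_lag = min(int(sample_rate / f0_min), len(frame) - 1)
    if min_lag < 1 or max_lag <= min_lag:
        raise ValueError("frame too short for the requested F0 range")

    def autocorr(lag):
        # Unnormalized autocorrelation of the frame at the given lag.
        return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

    # The lag with the strongest self-similarity is taken as the pitch period.
    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return sample_rate / best_lag


# Synthetic 40 ms frame of a pure 200 Hz tone (hypothetical test signal).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(320)]
f0 = estimate_f0(tone, sample_rate)
print(f"estimated F0: {f0:.1f} Hz")
```

Real speech would additionally need framing, windowing, and a voicing decision, all of which Praat handles internally.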
Z-Score Normalization
Pitch variation due to gender difference must be overcome when processing the speech signal, so the Z-score of the pitch contour is taken to normalize the data to a common scale that makes it gender independent. The Z-score of a sample within a set of data is the number of standard deviations by which the sample lies above or below the mean. The Z-score of a sample can be calculated as [4],
\( z = \frac{x - \upmu}{\upsigma} \)
where \( x \) is the sample value, \( \upmu \) is the mean, and \( \upsigma \) is the standard deviation.
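The normalization can be sketched in a few lines of Python. The function name `zscore_f0`, the convention that unvoiced frames are stored as 0 Hz, and the two short contours are assumptions for illustration; the contour values are made up and are not taken from the recorded corpus.

```python
from statistics import mean, pstdev


def zscore_f0(f0_hz):
    """Z-score normalize an F0 contour, leaving unvoiced frames as None.

    Assumes unvoiced frames are marked with 0 Hz (an illustrative convention).
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    mu = mean(voiced)
    sigma = pstdev(voiced)  # population standard deviation of voiced frames
    if sigma == 0:
        raise ValueError("contour is flat; Z-score is undefined")
    return [(f - mu) / sigma if f > 0 else None for f in f0_hz]


# Made-up contours with the same shape at different pitch levels.
male_contour = [110, 120, 130, 0, 120]
female_contour = [220, 240, 260, 0, 240]

male_z = zscore_f0(male_contour)
female_z = zscore_f0(female_contour)
print(male_z)
print(female_z)
```

Because the Z-score removes both the mean and the spread of each speaker's contour, the two illustrative contours map to the same normalized shape even though their raw F0 values differ by a factor of two.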
Figure 4 shows the effect of Z-score normalization on the input sound signals recorded from a male and a female speaker of the same dialectal region, respectively.
Role of F0 Frequency
There is a substantial amount of data on the frequency of the voice fundamental (F0) in the speech of speakers who differ in age and sex [3]. The fundamental frequency plays a very important role in differentiating male and female speakers. Published data on F0 in speech show that its range of variation, often expressed as two standard deviations (SD) of the F0 distribution, is approximately the same for men and women when expressed in semitones [7]. Male speakers have a low F0 and female speakers have a high F0; therefore, raw F0 values alone cannot be used to represent the underlying tonal features of a language. Table 2 lists the fundamental frequency (F0) values for the various dialects.
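The semitone argument can be made concrete with a short sketch. The conversion uses the standard relation of 12 semitones per doubling of frequency; the function names, the 100 Hz reference, and the example ranges (100 to 200 Hz and 200 to 400 Hz) are illustrative assumptions, not measurements from Table 2.

```python
import math


def hz_to_semitones(f_hz, ref_hz=100.0):
    """Convert a frequency in Hz to semitones relative to ref_hz (assumed reference)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def range_in_semitones(low_hz, high_hz):
    """Width of an F0 range in semitones; independent of the reference chosen."""
    return hz_to_semitones(high_hz) - hz_to_semitones(low_hz)


# Hypothetical speaker ranges, chosen only to illustrate the scaling.
male_span_hz = 200.0 - 100.0
female_span_hz = 400.0 - 200.0
male_span_st = range_in_semitones(100.0, 200.0)
female_span_st = range_in_semitones(200.0, 400.0)

print(f"span in Hz: male {male_span_hz}, female {female_span_hz}")
print(f"span in semitones: male {male_span_st:.2f}, female {female_span_st:.2f}")
```

In this example the two ranges differ by a factor of two in Hz but are equally wide in semitones, which is the sense in which a logarithmic scale makes male and female ranges comparable.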
Comparative Lexicon Tonal Analysis (Pitch Contours)
As stated before, the dialectal differences as well as the position of the Toneme determine the variation in the pitch and the intensity contour of the dialectal and tonal words.
The effect of toneme position on the tone of the signal has already been analyzed in Fig. 2 above. The results of the comparative analysis among the dialectal variations are shown in Fig. 5, which presents a few examples of the pitch contours of dialectal words in the recorded dataset.
5 Results and Experimental Analysis
The pitch boundaries based on the fundamental frequency of the signal show the dialectal and tonal variations of a word across regions. The values of the fundamental frequency (F0) are given in Table 2. Table 3 presents the change in pitch and the corresponding intensity variation for the dialectal varieties of the input signal. The experiment shows that the mean pitch for the Punjabi dialects of the present-day regions of Punjab ranges from 130.16 to 250.62, and for the Himachali belt of undivided Punjab it ranges from 124.75 to 248.01. Although the two ranges are similar, Table 3 indicates that the difference in pitch variation is smallest for the Majhi dialect and the Himachali-belt dialects Chambiali and Mandiali. These regions show higher pitch values, indicating more high-tone signals. The Kullui, Malwai, and Doabi regions are similar to one another, with lower mean pitch values indicating more low tones. The intensity remains almost constant, changing negligibly across the Punjabi dialects of undivided Punjab. Nevertheless, the Chambiali region shows intensity and mean pitch values closer to those of the modern Punjabi dialectal regions than the others do.
6 Conclusions
The paper describes the dialectal variations reflected in the tone of the signal. The pitch, intensity, and fundamental frequency variations of the signal are studied. The pitch boundaries based on the fundamental frequency of the signal show the dialectal and tonal variations of a word in different regions. The determined values are important because including the tonal information of words when designing an ASR system can considerably increase the efficiency of the designed system.
References
Singh, A., Pandey, D., Agrawal, S.S.: Analysis of Punjabi tonemes. In: 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom). IEEE (2015)
Lata, S., Samarth A.: Laryngeal tonal characteristics of Punjabi—an experimental study. In: 2013 International Conference on Human Computer Interactions (ICHCI). IEEE (2013)
Liu, J., He, X., Mo, F., Yu, T.: Study on tone classification of Chinese continuous speech in speech recognition system
Sarma, B.D., et al.: Detection of Mizo tones. In: Sixteenth Annual Conference of the International Speech Communication Association (2015)
Lata, S., Arora, S., Kaur, S.: Lexical stress in Punjabi and its representation in PLS
Lata, S.: Challenges for design of pronunciation lexicon specification (PLS) for Punjabi
Traunmüller, H., Eriksson, A.: The frequency range of the voice fundamental in the speech of male and female adults. Consulté le 12(02), 2013 (1995)
Sarmah, P., Dihingia, L., Lalhminghlui, W.: Contextual variation of tones in Mizo. In: Proceedings of Interspeech (2015)
Lata, S., Arora, S.: Exploratory analysis of Punjabi tones in relation to orthographic characters: a case study
Sarmah, P., Dihingia, L., Lalhminghlui, W.: Contextual variation of tones in Mizo
Bansal, I., Sharan, S., Kilt, A. (n.d.).: Corpus design and development of an annotated speech database for Punjabi. College of Engineering, Gurgaon
Dua, M., Aggarwal, R.K., Kadyan, V., Dua, S.: Punjabi automatic speech recognition using HTK. Int. J. Comput. Sci. Issues 9(4), 359–364 (2012)
Kaur, E.A., Singh, E.T.: Segmentation of continuous Punjabi speech signal into syllables. In: The World Congress on Engineering and Computer Science (WCECS), vol. 1, pp. 20–23 (2010)
Lata, S., Arora, S. (n.d.).: Exploratory analysis of Punjabi tones in relation to orthographic characters: a case study, pp. 3–7
Sarma, B.D., Sarmah, P., Lalhminghlui, W., Mahadeva Prasanna, S.R.: Detection of Mizo tones. In: Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech, 2015), pp. 934–937 (2015)
Singh, P., Dutta, K.: Formant analysis of Punjabi non-nasalized vowel phonemes. In: 2011 Proceedings of the International Conference on Computational Intelligence and Communication Systems (CICN 2011), pp. 375–380 (2011). https://doi.org/10.1109/CICN.2011.79
Traunmüller, H., Eriksson, A.: The frequency range of the voice fundamental in the speech of male and female adults (Cash Notes 2014). Department of Linguistics, University of Stockholm, vol. 97, pp. 1905–1915 (1994)
Kumar, A.: Implementation and performance evaluation of continuous Hindi speech recognition (1) (2014)
Acknowledgements
The authors would like to express their sincere thanks to the people of Punjab who extended their support and cooperation during the data collection phase. They would also like to thank the authorities of the Linguistics Department, Punjabi University, and the Speech and Multimodal Lab, Chitkara University, for their support in fulfilling the requirements for data processing. They would also like to thank Ms. Suman Preet and Mr. Surjit Singh for their help during corpus preparation.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Arora, A., Kadyan, V., Singh, A. (2019). Effect of Tonal Features on Various Dialectal Variations of Punjabi Language. In: Rawat, B., Trivedi, A., Manhas, S., Karwal, V. (eds) Advances in Signal Processing and Communication . Lecture Notes in Electrical Engineering, vol 526. Springer, Singapore. https://doi.org/10.1007/978-981-13-2553-3_45
DOI: https://doi.org/10.1007/978-981-13-2553-3_45
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2552-6
Online ISBN: 978-981-13-2553-3
eBook Packages: Engineering, Engineering (R0)