1 Introduction

Konkani is an Indo-Aryan language belonging to the Indo-European language family (footnote 1). It has more than 2.5 million speakers. It is the official language of the state of Goa and is spoken along the western coast of India, including Goa, the Konkan region of Maharashtra, Karwar, Mangaluru and other coastal areas of Karnataka, and parts of Kerala, Gujarat, Dadra & Nagar Haveli and Daman & Diu. Konkani is one of the 22 scheduled languages listed in the Eighth Schedule of the Constitution of India (footnote 2). Konkani and Marathi are often referred to as sister languages, as many words have similar formation and meaning, with some variation. The first known Konkani inscription dates back to 1187 CE (footnote 3).

The work presented here is an attempt to create a resource for Konkani language development. This paper describes the creation of a phonetic transcription system for Konkani, an important component in developing Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) systems for any language, in this case Konkani.

The paper is organised as follows: Sect. 1 introduces the Konkani language and presents the need for this work. Sect. 2 outlines the motivation. Sect. 3 defines the problem statement, Sect. 4 presents the methodology and datasets used, and Sect. 5 reports and discusses the results. Finally, Sect. 6 concludes the paper with the scope for future improvements.

2 Motivation

Konkani is an under-resourced language with very few resources available for research and development, and there are hardly any applications for it. In the recent decade, efforts have been made to develop resources for Konkani, viz., Konkani Wordnet [3, 4, 11], ILCI Corpus [2], CIIL Corpus [7], the SPTIL Project (footnote 4), etc. However, an online Konkani pronunciation dictionary has not yet been developed. This motivated our research group to create a Konkani phonetic transcription system, which may benefit future research on TTS and ASR systems for Konkani.

3 Problem Statement

The goal of the work presented in this paper is to design an automatic phonetic transcription system specific to the Konkani language. The system takes Konkani text written in Devanagari as input and produces a phonetic transcription in the International Phonetic Alphabet (IPA). The transcriptions are produced by a rule-based, one-to-one character-to-phoneme mapping that currently does not consider the context of the word.

4 Methodology and Datasets Used

4.1 Rules for Phonetic Mapping

The Devanagari character set for Konkani is taken from A Gold standard Konkani Raw Text Corpus [8]. Tables 1, 2, and 3 show the mapping of Konkani characters to IPA symbols. The rules for phonetic transcription are drawn from previous work [1, 5, 9]; IPA mappings are also provided for some characters not covered by that work. Some rules are summarized and presented in tabular form, together with the Devanagari characters, the approximate IPA notation, the phonetic transcription used for the dictionary and the UTF-8 code of each character. Table 4 provides approximate IPA symbols for the Devanagari vowels and diphthongs. Konkani has nine vowels, of which six are represented in the script and three are not. The nine vowels of the language are: . The three vowels that are part of the vowel system of the language but have no unique character in the script are: , and . A few other points should be noted with regard to the vowel system of the language:

  • Vowel length is not phonemic in Konkani. Hence, one of the Devanagari characters representing the vowel length contrast, namely and , is redundant for the language.

  • Vowels from Sr. No. 16 in Table 4 are not found in the language.

  • The vowel [ ] occurs only in Sanskrit loans, and specifically only in proper nouns. It should be noted that some vowel phonemes in Konkani, like / / and / /, have the same written representation [5].
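The redundancy of the length contrast can be illustrated with a minimal Python sketch. It assumes, for illustration only, that the character pairs in question are the short and long forms of the high vowels (इ/ई and उ/ऊ), and it uses a hypothetical `LENGTH_PAIRS` table and `describe` helper that are not part of the system described in this paper. The sketch also shows how the Unicode code point and UTF-8 bytes of a character can be listed alongside its IPA symbol.

```python
import unicodedata

# Hypothetical mapping (an assumption for illustration): the short and long
# letters of each pair share one IPA symbol because length is not phonemic.
LENGTH_PAIRS = {
    "इ": "i", "ई": "i",   # U+0907, U+0908
    "उ": "u", "ऊ": "u",   # U+0909, U+090A
}

def describe(ch):
    """Return (code point, UTF-8 bytes in hex, Unicode name, IPA) for one letter."""
    return (
        f"U+{ord(ch):04X}",
        ch.encode("utf-8").hex(" "),
        unicodedata.name(ch),
        LENGTH_PAIRS[ch],
    )

for ch in LENGTH_PAIRS:
    print(ch, *describe(ch))
```

Both members of a pair yield the same IPA symbol, so a dictionary built this way treats the two spellings as phonetically identical.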

Before elaborating on the consonant inventory of the language, it is worth mentioning that voicing and aspiration are phonemic in the language. Nasalization is also phonemic, with the vowels displaying an oral-nasal contrast in almost all positions. The language also has the nasal consonant phonemes /m/, /n/, / /. Among the labial consonant phonemes, Konkani distinguishes between the voiceless /p/ and the voiced /b/, as well as between the voiced aspirated / / and the voiced non-aspirated /b/. With the exception of the voiceless aspirated labial consonant [ ], the other consonants contrast in voicing and aspiration. Scholars have claimed that [ ] existed at an older stage of the language but was replaced by the labio-dental fricative /f/ owing to large-scale Portuguese borrowings into the language.

Examples of minimal pairs exhibiting differences between labial sounds are given below:

  • There is a contrast between the voiceless plosive /p/ and the voiced plosive /b/, e.g. [paj] ‘father-M.SG.’ [baj] ‘endearment word for a girl child-N.SG.’

  • The voiceless plosive /p/ also contrasts with the aspirated voiced plosive / /, e.g. [ ] ‘son-M.SG.’ [ ] ‘ghost-N.SG.’

Four dental phonemes / /, / /, / / and / / display voicing and aspiration contrast. Examples for these phonemes are given below:

  • voiceless dental plosive / / versus voiced dental plosive / /, e.g., [ ] ‘wick; candle-F.SG.’ [ ] ‘dispute; argument-M.SG.’

  • voiceless non-aspirated dental plosive / / versus voiceless aspirated dental plosive / /, e.g., [ ] ‘clap-F.SG.’ [ ] ‘small plate for eating-F.SG.’

  • voiced non-aspirated dental plosive / / versus voiced aspirated dental plosive / /, e.g., [ ] ‘door-N.SG.’ [ ] ‘sharp edge-F.SG.’

With respect to the place of articulation, all the above dental consonants contrast with the retroflex consonant phonemes / /, / /, / /, and / / which also display voicing and aspiration differences. The following pairs of words make this distinction explicit:

  • voiceless retroflex plosive / / versus voiced retroflex plosive / /, e.g., [ ] ‘way/path-F.SG.’ [ ] ‘growth-F.SG.’

  • voiceless unaspirated retroflex plosive / / versus voiceless aspirated retroflex plosive / /, e.g., [ ] ‘a narrow water course-M.SG.’ [ ] ‘lesson-M.SG.’

  • voiceless aspirated versus voiced aspirated (/ / versus / /) [ ] ‘sound of crackers, bullet, etc.’ [ ] ‘loud noise of explosion, fall, etc.’

  • Velar consonant phonemes in Konkani, namely , also display a contrast with respect to voicing and aspiration. Konkani also has the velar nasal .

Konkani has dento-palatal affricates / /, / / and / / which contrast with the palatal affricates / /, / /, / / and / /.

With the exception of the voiceless aspirated counterpart of the dental affricate / /, all others contrast with respect to place of articulation, voicing and aspiration. However, the written form of the language (the Devanagari script) lacks separate characters for showing the distinction between these sounds.

Minimal pairs exhibiting meaning differences for dento-palatal and palatal phonemes are given below:

  • Voiceless unaspirated dento-palatal affricate versus voiced unaspirated versus voiced aspirated affricate (/ / versus / / versus / /) [ ] ‘climb-IMP.2P.SG.’ [ ] ‘heavy-ADJ.’ [ ] ‘fall-IMP.2P.SG.’ [ ] ‘graze-IMP.2P.SG.’ [ ] ‘if’ [ ] ‘spring-F.SG.’

  • Voiceless unaspirated palatal affricate versus voiced unaspirated palatal affricate (/ / versus / /) [ ] ‘four’ [ ] ‘tired’

  • Voiceless unaspirated palatal affricate versus voiced aspirated palatal affricate (/ / versus / /) [ ] ‘disciple-M.SG.’ [ ] ‘small garland-M.SG.’

  • Voiced unaspirated dento-palatal affricate versus voiced unaspirated palatal affricate (/ / versus / /) [ ] ‘mature-ADJ.’ [ ] ‘(month of) June’
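Because the script writes the dento-palatal and palatal affricates alike, a one-to-one mapping has to commit to one reading of such a character. One way to make the ambiguity explicit is to enumerate every possible reading, as in the following Python sketch. The table `AMBIGUOUS`, its IPA values and the function `candidates` are hypothetical and serve only as an illustration; they are not the rules of the system described in this paper.

```python
from itertools import product

# Hypothetical one-to-many table (an assumption for illustration): each
# grapheme lists the affricate phonemes it may stand for.
AMBIGUOUS = {
    "च": ["ts", "tʃ"],
    "ज": ["dz", "dʒ"],
}

def candidates(symbols):
    """Expand a sequence of graphemes/phones into every possible reading."""
    # Unambiguous symbols contribute a single option: themselves.
    options = [AMBIGUOUS.get(s, [s]) for s in symbols]
    return ["".join(combo) for combo in product(*options)]

print(candidates(["च", "a", "r"]))
```

A lexicon or a context-sensitive rule would then be needed to pick the correct reading among the candidates.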

Konkani also has velar consonant phonemes /k/, / /, /g/, / / which show contrast in voicing and aspiration. Meaning differences arising from this opposition are shown below:

  • voiceless unaspirated velar versus voiceless aspirated velar (/k/ versus / /) [ ] ‘banana tree-F.SG.’ [ ] ‘sport/game-M.SG.’

  • voiced unaspirated velar versus voiced aspirated velar (/g/ versus / /) [ ] ‘cow-F.SG.’ [ ] ‘wound-M.SG.’

Konkani has three nasals: the bilabial nasal [m], the dental nasal [n] and the retroflex nasal [ ]. These contrast with each other. The occurrence of the velar nasal [ ] and the palatal nasal [ ] is predictable in that they occur as homorganic nasals (as in [ ] ‘body-N.SG.’, [ ] ‘a member of the village council-M/F/N.’).

Contrasts between nasal phonemes are given below:

[ka:n] ’ear-M.SG.’ [ka:m] ’work-N.SG.’

[ ] ’caution-N.SG.’ [ ] ’a large vessel of copper or iron-N.SG.’

There are four fricatives in the language: the labio-dental fricative [f], the voiceless alveolar fricative [s], the postalveolar fricative [ ] and the voiceless glottal fricative [h]. Some words showing the contrast between these sounds are given below:

[fa:r] ’explosion-M.SG.’ [sa:r] ’extract-M.SG.’ [ ] ’city-N.SG.’ [ha:r] ’python; garland-M.SG.’

The retroflex fricative [ ], which is shown in the script, occurs only in the written form and is confined to proper nouns.

The language has the labio-dental approximant [ ] and palatal approximant [j]. Contrast between these is shown below:

[ ] ’argument-M.SG.’ [ ] ’memory-F.SG.’

The language also contrasts the dental lateral [l] with the retroflex lateral [ ]. The following pair of words displays this contrast:

[pa:l] ’lizard-F.SG.’ [ ] ’root of a tree-N.SG.’

The language also has the trill [r]. Although some works refer to its place of articulation as dental, it seems to occur in alveolar position in some words. A word pair contrasting this sound with the dental lateral approximant [l] is given below:

[ra:g] ’anger-M.SG.’ [la:g] ’cajolery; wooing-F.SG.’

The Devanagari script used for the language shows two more characters, and , which are actually the consonant clusters [ ] and [dn] respectively. The major limitation of the script is that it lacks characters for some important contrasts that exist in the language and at times includes characters that are not relevant to the language. The need to revise the script was raised years ago by some scholars.

Table 1. Konkani vowels diphthongs and diacritic.
Table 2. Konkani Consonant Set 1.
Table 3. Konkani Consonant set 2.
Table 4. Transcription rules for vowels and diphthongs.

Table 5 presents the rules for the vowel diacritics.

Table 5. Transcription rules for Vowel Diacritics.
Table 6. Transcription rules for Ayogavaha.
Table 7. Transcription rules for Consonants.

In Table 6, transcription rules for Chandrabindu, Anusvara and Visarga are presented. Chandrabindu is used for the tatsama (Sanskrit borrowed) words.
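How rules of this kind might be applied can be sketched in Python. The outputs below are illustrative assumptions rather than the entries of Table 6: chandrabindu is rendered as nasalization of the vowel (IPA combining tilde), anusvara as a nasal homorganic with the following plosive (in line with the predictable velar and palatal nasals noted earlier), and visarga as [h]. The table `HOMORGANIC` and the function `ayogavaha_to_ipa` are hypothetical names.

```python
CHANDRABINDU, ANUSVARA, VISARGA = "\u0901", "\u0902", "\u0903"
NASAL_TILDE = "\u0303"  # IPA combining tilde, marks a nasalized vowel

# Hypothetical table (an assumption): nasal chosen by the following consonant.
HOMORGANIC = {
    "क": "ŋ", "ग": "ŋ",   # velar
    "च": "ɲ", "ज": "ɲ",   # palatal
    "ट": "ɳ", "ड": "ɳ",   # retroflex
    "त": "n", "द": "n",   # dental
    "प": "m", "ब": "m",   # labial
}

def ayogavaha_to_ipa(mark, next_char=None):
    """Map one ayogavaha sign to an IPA string, given the following character."""
    if mark == CHANDRABINDU:
        return NASAL_TILDE
    if mark == ANUSVARA:
        # Fall back to vowel nasalization when no plosive follows.
        return HOMORGANIC.get(next_char, NASAL_TILDE)
    if mark == VISARGA:
        return "h"
    raise ValueError(f"not an ayogavaha sign: {mark!r}")

print(ayogavaha_to_ipa(ANUSVARA, "ग"))
```

Note that the anusvara case needs to look at the next character, which goes one step beyond a strictly context-free, one-to-one mapping.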

Table 7 presents the rules for consonant transcription. It should be noted that the place of [ ] has been taken by the labio-dental fricative [ ], which is said to be an effect of Portuguese borrowings into the language. Konkani also has dental and palatal affricates that are phonemic but are written alike in its writing system. Nasalization and aspiration are phonemic in Konkani.

4.2 Dataset Used

For testing, we created phonetic transcriptions for the Konkani text available from [6], which contains 74 sentences. We also created transcriptions of 27 additional sentences to cover all the phones in the language. These data were used for testing the proposed transcription system.

4.3 Methodology to Create Konkani Phonetic Transcription System and Result

Fig. 1. Konkani Phonetic Transcription System Architecture.

The developed transcription system is rule-based and does not consider the context of the word. Figure 1 shows the architecture of the Konkani phonetic transcription system. A Konkani sentence in Devanagari is given as input to the system. The sentence is broken into tokens (words), and these tokens are then broken down into characters for phonetic mapping. The system is implemented in the Python programming language. Each Devanagari character is mapped to its IPA symbol using the rules provided in Tables 4, 5, 6 and 7. After all the transcription rules have been applied, we obtain the final transcription in IPA format.
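The flow from sentence to tokens to characters to IPA can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in this work: the table `RULES` is a hypothetical four-entry excerpt, the Devanagari spellings of the example words are assumed, and the handling of the inherent vowel is deliberately left out.

```python
# Hypothetical excerpt of a character-to-IPA table (an assumption); the full
# rule set would come from Tables 4, 5, 6 and 7.
RULES = {"क": "k", "म": "m", "न": "n", "ा": "a:"}

def transcribe_word(word):
    """Map each character independently; unknown characters pass through."""
    return "".join(RULES.get(ch, ch) for ch in word)

def transcribe_sentence(sentence):
    """Split on whitespace into tokens, then transcribe token by token."""
    return " ".join(transcribe_word(token) for token in sentence.split())

print(transcribe_sentence("कान काम"))
```

Because every character is looked up on its own, the design is simple to extend with new table entries, but it cannot resolve characters whose pronunciation depends on their neighbours.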

4.4 Evaluation Metrics

To assess the performance of the phonetic transcription system, the evaluation metric used is Word Error Rate (WER). WER is the percentage of words in the ground-truth test data set that the transcription system does not transcribe correctly. Word accuracy is obtained by subtracting WER from 100; equivalently, it is the percentage of words in the ground-truth test data set that the system transcribes correctly.
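The metric as defined here can be computed by a word-by-word comparison, sketched below in Python. The function names are hypothetical, and the sketch assumes that the predicted and reference transcriptions are aligned word for word, which holds when each input word yields exactly one output word. It is therefore simpler than the edit-distance-based WER commonly used in speech recognition.

```python
def word_error_rate(predicted, reference):
    """Percentage of reference words whose predicted transcription differs."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must be aligned word for word")
    if not reference:
        raise ValueError("reference must not be empty")
    errors = sum(p != r for p, r in zip(predicted, reference))
    return 100.0 * errors / len(reference)

def word_accuracy(predicted, reference):
    """Word accuracy is 100 minus the word error rate."""
    return 100.0 - word_error_rate(predicted, reference)

# Toy data (invented for illustration): 3 of 5 words are wrong.
ref = ["ka:n", "ka:m", "fa:r", "sa:r", "ha:r"]
hyp = ["ka:n", "ka:m", "pa:r", "ʃa:r", "a:r"]
print(word_error_rate(hyp, ref), word_accuracy(hyp, ref))
```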

5 Results and Discussion

This system exhibits a word accuracy of 40%, with a WER of 60%. When the same dataset is evaluated on the existing Devanagari-to-phonetic transcription system [10], that system shows a word accuracy of 7.50%. Thus our system performs considerably better than the existing system on this dataset. However, there is still scope for improvement. Errors were mainly in the mapping of some characters to their respective phonemes. In this, the mapping of the character to phonemes and was crucial. The character in the script did not show the closed and open contrast between the phoneme pair. The same is the case with the back vowels and . The phonemes and also contributed significantly to the errors. Like and , these phonemes too are represented by the same character in the script. Errors were also introduced because the phoneme does not get omitted at all places. Identifying schwa deletion rules in depth might help improve the transcription system’s performance.
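As an indication of what a schwa deletion rule could look like, the following Python sketch removes a word-final schwa unless it is the only vowel in the word. This is a generic heuristic of the kind used for other Indo-Aryan languages and is an assumption made for illustration; it is not a rule established for Konkani in this work, and the vowel set `VOWELS` and the function `drop_final_schwa` are hypothetical.

```python
# Hypothetical vowel inventory for the sketch (an assumption).
VOWELS = {"ə", "a", "a:", "i", "u", "e", "o"}

def drop_final_schwa(phones):
    """Remove a word-final schwa unless it is the only vowel in the word."""
    if len(phones) > 1 and phones[-1] == "ə":
        # Keep the schwa in words such as C + ə that have no other vowel.
        if any(p in VOWELS for p in phones[:-1]):
            return list(phones[:-1])
    return list(phones)

print(drop_final_schwa(["k", "a:", "m", "ə"]))
print(drop_final_schwa(["n", "ə"]))
```

Such a rule would run as a post-processing step after the character mapping; word-medial deletion would need further, context-dependent rules.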

6 Conclusion and Future Work

In this paper, we presented a Konkani phonetic transcription system. The system uses rule-based character-to-phoneme mapping to produce transcriptions in the International Phonetic Alphabet (IPA). In future work, we shall focus on improving its accuracy by identifying and adding more transcription rules. With improved accuracy, the system can lead to the creation of a phonetic dictionary, which would be an essential component for developing TTS and ASR systems for the Konkani language. A complete ASR system could then be built by additionally creating acoustic and language models.