1 Introduction

Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that studies the problems of automated generation and understanding of natural human languages. It is a convenient description for all attempts to use computers to process natural language, and it includes Natural Language Understanding (NLU) and Natural Language Generation (NLG). The goal of NLP is to design and build software that will analyze, understand, and generate the languages that humans use to communicate with each other.

The study of auxiliaries from a morphological perspective is extremely interesting from the semantic and pragmatic points of view, and auxiliaries still await detailed and careful study. A comparative study would yield rich information regarding the semantic structure of languages, apart from being useful for translation systems. Auxiliaries are helping verbs: they are used in conjunction with a main verb, which provides the main semantic content of the clause. An example is the verb have in the sentence “I have finished my dinner”. Here, the main verb is finished, and the auxiliary have helps to express the perfect aspect. To establish the state of the art, we carried out a literature survey.

2 Literature Survey

The syntax of auxiliaries has given rise to much discussion in the generative literature. [1] discussed a finite-state-transducer-based system for Hindi, [2] discussed auxiliaries from a generative perspective, and [3, 4] focus on syntactic structures. With respect to auxiliaries, [5] focussed on serial verb construction using auxiliaries, [6] on the syntax of valuation in auxiliary-participle constructions, and [7] on verb phrase ellipsis, phases and the syntax of morphology. [8] discussed issues concerning clitics, morphological merger, and the mapping to phonological structure, and [9] discussed Kannada morphology with respect to the syntax and semantics of the Kannada language.

2.1 English Auxiliaries

English auxiliary verbs, or helping verbs, such as will, shall, may, might, can, could, must, ought to, should, would, used to, and need are used in conjunction with main verbs to express shades of time and mood. The combination of helping verbs with main verbs creates what are called verb phrases in English. In English, shall is used to express the simple future. The different forms of have (has, have, had) are used to express the present perfect and past perfect tenses. Modal auxiliaries such as can, could, may, might, must, ought to, shall, should, will, and would do not change form for different subjects, for example “I can write”. Modal auxiliaries express various meanings of necessity, advice, ability, expectation, permission, possibility, etc.

3 Proposed Methodology

Auxiliaries occupy more than 75% of the corpus file; hence, the analysis and handling of Kannada auxiliaries is crucial for translation purposes. The methodology is shown in Fig. 1. The input module reads the input and tokenizes the input sentence into words, and each word is transliterated using the ir.pl program. Each transliterated word is searched for in the dictionary: if the word is found, the tag in the dictionary is assigned to it; otherwise, the word is passed to the next module to check for morphological inflections. The morph module analyses the input and gives output with part-of-speech tags.

Fig. 1 Proposed architecture for auxiliary analysis
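
The flow described above (tokenize, transliterate, look up in the dictionary, fall back to the morph module) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the dictionary entries, the tag names, and the functions `transliterate`, `morph_analyse` and `tag_sentence` are hypothetical stand-ins, and the real transliteration is done by the ir.pl program, whose behaviour is not reproduced here.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy dictionary: transliterated word -> tag.
# The entries and the tag name are illustrative, not taken from the real dictionary.
DICTIONARY: Dict[str, str] = {"bare": "VERB", "maaDu": "VERB"}


def transliterate(word: str) -> str:
    """Stand-in for the transliteration step (ir.pl in the text).

    Assumption: the input is already in roman transliteration,
    so the word is returned unchanged.
    """
    return word


def morph_analyse(word: str) -> str:
    """Stand-in for the morph module; a real analyser would return a POS tag."""
    return "NEEDS_MORPH_ANALYSIS"


def tag_sentence(
    sentence: str,
    dictionary: Dict[str, str] = DICTIONARY,
    morph: Callable[[str], str] = morph_analyse,
) -> List[Tuple[str, str]]:
    """Tokenize on whitespace, transliterate, then tag each word."""
    tagged = []
    for token in sentence.split():
        word = transliterate(token)
        tag = dictionary.get(word)   # dictionary lookup first
        if tag is None:              # not found: pass to the morph module
            tag = morph(word)
        tagged.append((word, tag))
    return tagged
```

Calling `tag_sentence("bare baredukoDu")` tags the first word from the dictionary and hands the second to the morph stand-in, which mirrors the fallback order in the text.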

3.1 Kannada Auxiliaries

Kannada auxiliaries can be divided into aspect auxiliaries and modal auxiliaries. Auxiliaries in English occur as free morphemes and are easy to analyse. In Kannada, however, they occur as bound morphemes suffixed to the verb with which they occur, and thus yield rich information about the semantic structure of the Kannada language.

3.2 Aspect Auxiliaries

An aspect auxiliary is always preceded by the past participle form of the verb (past verbal participle + aspect auxiliary verb). Another essential distinction between Kannada and English auxiliaries is that the verbs which act as aspect markers in Kannada are also used as main verbs. This is generally not true of English auxiliaries such as shall. An illustration is given in Table 1.

Table 1 Inventory of aspect auxiliary verbs in Kannada

Kannada has a set of verbs that may be added to a verbal participle to give certain semantic nuances to the meaning of the sentence. These aspect markers are very similar to main verbs in their morphology and syntax; in fact, they are derived from certain main verbs. Semantically, however, in their aspectual use they do not express the lexical meaning of the corresponding main verbs. The aspectual biDu ‘completive’ does not mean the same as the main verb biDu ‘leave’. Table 1 shows examples of verb formation by adding auxiliaries. In Kannada, thousands of such verb forms can be generated, which is why the study of auxiliaries in Dravidian languages is a challenging task. The verbs that act as aspect auxiliaries in Kannada are thus main verbs in their own right that additionally serve as auxiliaries; this is not true for the English language. Consider the example below.

figure a

Here baredukoDu looks like a simple single verb, but it is a complex verb formed by adding the aspect auxiliary ‘koDu’ to baredu, the non-finite (past verbal participle) form of the verb ‘bare’ (write). This kind of verb formation leads to lakhs (hundreds of thousands) of complex verbs, although in reality there are only around 2000 basic verb roots. Moreover, in the complex verbs formed by this process, the TAM (tense, aspect and modality) inflections attach to the auxiliary verb and not to the first verb, yet the meaning of the formation is inferred from the first verb. In the example tiMdubiDu in Table 1, the meaning is simply ‘eaten’; it is not ‘ate and then left’. This process of complex verb formation is productive and regular in Kannada. It is not wise to store such complex verbs in the dictionary; instead, we have handled the formation of such complex verbs through our morphology. In currently existing morphological systems, these verbs are stored as basic verbs and kept in the dictionary. Another speciality of Kannada is that more than one aspect marker can be attached to a verb; this is not the case in English. Consider the example below.

figure b

In this example, four aspect markers are added, which illustrates the complexity of word formation in Kannada. The meaning of this word is inferred from the first verb maaDu ‘do’, while the TAM inflections and PNG (person, number, gender) inflections follow the last verb biDu ‘leave’. The different auxiliaries explored from the CIIL (Central Institute of Indian Languages) 3 million word corpus are given in Table 1.
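
The decomposition of a complex verb such as baredukoDu or tiMdubiDu described in this section can be sketched as below. This is a hedged illustration and not the morphology system used in this work: the function `split_complex_verb` and both lookup tables are hypothetical, the mapping tiMdu to tinnu is an assumption added for the example, and only a single aspect auxiliary in its root form is handled (stacked or inflected auxiliaries are out of scope).

```python
from typing import Dict, Optional

# Aspect auxiliaries named in the text (root forms only).
ASPECT_AUXILIARIES = ("biDu", "koDu", "haaku", "nooDu", "hoogu")

# Past verbal participle -> verb root.
# 'baredu' -> 'bare' (write) comes from the text;
# 'tiMdu' -> 'tinnu' (eat) is an assumption added for illustration.
PARTICIPLE_TO_ROOT: Dict[str, str] = {"baredu": "bare", "tiMdu": "tinnu"}


def split_complex_verb(word: str) -> Optional[Dict[str, str]]:
    """Split 'participle + aspect auxiliary' into its parts.

    Returns None when the word does not end in a known auxiliary or the
    remainder is not a known past verbal participle.
    """
    for aux in ASPECT_AUXILIARIES:
        if word.endswith(aux):
            participle = word[: -len(aux)]
            root = PARTICIPLE_TO_ROOT.get(participle)
            if root is not None:
                return {
                    "lexical_root": root,        # the meaning comes from the first verb
                    "participle": participle,
                    "inflection_host": aux,      # TAM/PNG inflections attach here
                }
    return None
```

With this approach only the verb roots and the small auxiliary inventory need to be stored, instead of listing every complex verb in the dictionary.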

3.3 Modal Auxiliaries

Modal auxiliaries contribute different shades of grammatical meaning. The various possible modal auxiliaries are shown in Table 2.

Table 2 Modal auxiliary verbs in Kannada

Modal auxiliaries are always attached to the infinitive ‘al’ form of the verb.

figure c

The modal auxiliaries aagu and paDu in examples 7 and 8 denote passive constructions.

4 Observations and Conclusion

A sample Kannada file, Account4.aci.out, from the DoE CIIL corpus was selected for analysis, and it is observed that many words in the corpus are formed with aspect auxiliaries and also modal auxiliaries. The analysis varies with the type of corpus. The occurrence of the various auxiliaries in the corpus is shown in Fig. 2.

Fig. 2 The occurrence of aspect auxiliaries in file Account4.aci.out

In the above file, the auxiliary haaku ‘put’ occurred most often, followed by nooDu and koDu; the auxiliaries biDu and hoogu did not occur at all. The occurrences vary with the type of corpus used for testing. Similarly, the occurrence of modal auxiliaries in the file was also tested and is shown in Fig. 3.

Fig. 3 The occurrence of modal auxiliaries in file Account4.aci.out

Five files, F1, F2, F3, F4 and F5, each of around 1000 words, were taken at random for testing, and the results for auxiliary verbs are shown in Fig. 4. The auxiliary beeku ‘must’ occurred most often, followed by aagu and paDu, while kuuDadu did not occur. This variation depends on the corpus.

Fig. 4 Aspect auxiliaries distribution in CIIL corpus sample files
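
Occurrence counts of the kind plotted in the figures could be tallied along the following lines. This is a sketch under assumptions: it presumes a morphological analyser has already reduced each token to the root of the auxiliary it contains (or `None` when there is none), and the function `count_auxiliaries` and the input format are hypothetical. No corpus figures are reproduced here.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Sequence

# Modal auxiliaries named in the text.
MODAL_AUXILIARIES = ("beeku", "aagu", "paDu", "kuuDadu")


def count_auxiliaries(
    analysed_tokens: Iterable[Optional[str]],
    inventory: Sequence[str] = MODAL_AUXILIARIES,
) -> Dict[str, int]:
    """Count how often each auxiliary in the inventory occurs.

    analysed_tokens holds one entry per token: the auxiliary root found by
    the (assumed) analyser, or None. Auxiliaries that never occur are
    reported with a count of 0 so that they still appear in a chart.
    """
    counts = Counter(aux for aux in analysed_tokens if aux in inventory)
    return {aux: counts[aux] for aux in inventory}
```

Running the function once per sample file (F1 to F5) and collecting the returned dictionaries gives a per-file distribution; reporting zero counts explicitly keeps auxiliaries that did not occur visible in the comparison.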

The morphology of Kannada becomes complex due to the occurrence of auxiliaries as bound morphemes with the root or derivative stem. In contrast, they occur as free morphemes in English. We observe that five to six levels of aspect markers can be added to the derivative stem to form a complex verb root in Kannada. Auxiliary verbs help the main verb to denote the actions of the subject. They help in forming compound words and passive voice statements, and they are useful in serial verb construction. The morphology of auxiliary verbs, both aspect auxiliaries and modal auxiliaries, plays a very important role in a Dravidian language like Kannada. The presence of auxiliaries as bound morphemes contributes to the morphological richness of a language.