
1 Introduction

An inventory of Quechua verbs, compiled from the existing paper dictionaries, yields a lexicon of fewer than 1400 simple verbs.

In a previous article (footnote 1) we reported that we were able to extend this list to nearly 2000 verbs. The additional few hundred verbs were obtained by applying NooJ grammars to our corpus, where they occur as derivations built with compound suffixes.

For example, the form asirichiy appears in the corpus translated as “make him smile”. It contains the compound suffix -ri-chi-, which can be analyzed as follows: -chi- is factitive (to make someone do something), and -ri- expresses dynamism (to start doing the action defined by the verb). The remaining morpheme asi- is the lemma of the Quechua verb asiy (to laugh). Thus asiy (to laugh) has been derived with the suffixes -ri- and -chi- to give the new verb asirichiy (to make someone smile).

As another example, the form rantikuy appears in the corpus translated as “to sell”. It contains the suffix -ku- (auto-benefactive), which shifts the meaning of the original lemma ranti- (to buy, i.e. to acquire something) into a different semantic field: to sell (to get rid of something).

2 Generation of Quechua Verbs

One might think that having fewer than 1400 simple verbs would be a handicap for producing any kind of extensive literature, compared with a language like French, which has several thousand verbs. For French, Jean Dubois and Françoise Dubois-Charlier (D&D) inventoried more than 25000 entries in their dictionary (footnote 2), the «Dictionnaire électronique des verbes (français)».

However, Quechua has a remarkable strategy for generating new verbs by derivation from the simple ones, as we have just seen. It makes use of a set of 26 interposition suffixes, or IPS (footnote 3). To illustrate this, take the simple verb llamkay (to work), which is formed by the verbal lemma llamka- and the infinitive suffix -y. Interposing the suffix -isi- between them yields the derived verb llamka-isi-y (to help someone to work). Joining the suffix -chka- to the same lemma (the action is in progress, similar to the role of the progressive ending -ing in English) yields the new verb llamka-chka-y (to keep working). Applying the NooJ grammar of Fig. 1, or its algebraic expression V_SIP1_INF (footnote 4), to the set of 1400 verbs generates 36400 compound Quechua verbs (1400 lemmas × 26 suffixes). Besides the examples above, some of these are well-known lexicalized verbs, like the ones in Table 1.

Fig. 1. The NooJ grammar that generates one-dimension compound verbs using the 26 interposed suffixes (IPS)
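The one-suffix derivation described above can be sketched in a few lines of Python. This is only an illustration of the string operation (lemma + IPS + -y), not the NooJ grammar itself; the function name `derive` and the two-suffix sample list are assumptions made for the example.

```python
# Toy excerpt: only the two IPS used as examples in the text, not all 26.
IPS_SAMPLE = ["isi", "chka"]


def derive(infinitive: str, suffix: str) -> str:
    """Interpose one suffix between the lemma and the infinitive ending -y."""
    if not infinitive.endswith("y"):
        raise ValueError(f"not an infinitive in -y: {infinitive}")
    lemma = infinitive[:-1]        # llamkay -> llamka
    return f"{lemma}{suffix}y"     # llamka + isi + y


derived = [derive("llamkay", s) for s in IPS_SAMPLE]
print(derived)  # ['llamkaisiy', 'llamkachkay']
```

Applying the same function to every lemma and every IPS would enumerate all one-suffix candidates; whether each candidate is meaningful is a separate question.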


Table 1. Lexicalized compound Quechua verbs

But most of them are relatively unknown, as we will see below.

2.1 Combinations of Two Interposed Suffixes

Quechua grammar allows IPS to be agglutinated, which yields further new verbs. Two or more of them can be combined. For instance, the combination -chka-isi- can be added to the lemma llamka- to obtain the new verb llamka-chka-isi-y (to keep helping someone to work). However, the permuted combination *-isi-chka- is not grammatical. To determine which two-fold combinations are grammatical, we manually built the matrix of Fig. 2, based on field work. In it, valid combinations are marked with 1 and invalid ones with 0.

Fig. 2. Boolean matrix of two-fold IPS combinations
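A minimal sketch of how such a Boolean matrix can be queried, assuming a sparse representation that stores only the cells attested in this section. The names `KNOWN_CELLS`, `is_grammatical` and `combine` are hypothetical; the full 26 × 26 matrix comes from field work and is not reproduced here.

```python
from typing import Optional

# Cells of the two-fold matrix that the text itself attests.
# 1 = grammatical order, 0 = ungrammatical order.
KNOWN_CELLS = {
    ("chka", "isi"): 1,   # llamka-chka-isi-y
    ("isi", "chka"): 0,   # *-isi-chka-
    ("chi", "mu"): 1,     # CHIMU
    ("chi", "pu"): 1,     # CHIPU
    ("isi", "chi"): 1,    # ISICHI
}


def is_grammatical(first: str, second: str) -> Optional[bool]:
    """True/False for attested cells, None when the cell is not in this excerpt."""
    cell = KNOWN_CELLS.get((first, second))
    return None if cell is None else bool(cell)


def combine(infinitive: str, first: str, second: str) -> str:
    """Build lemma + first + second + y, refusing orders not marked 1."""
    if is_grammatical(first, second) is not True:
        raise ValueError(f"-{first}-{second}- is not attested as grammatical")
    return f"{infinitive[:-1]}{first}{second}y"


print(combine("llamkay", "chka", "isi"))  # llamkachkaisiy
print(is_grammatical("isi", "chka"))      # False
```

Returning `None` for cells outside the excerpt keeps unknown combinations distinct from combinations that were checked and rejected.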

We have verified that at least 292 of these combinations are grammatically correct generators of new verbs. Here is a sample of the resulting two-dimensional IPS compounds:

CHIMU = :CHI :MU;

CHIPU = :CHI :PU;

CHKAIKAMU = :CHKA :IKAMU;

CHKAIKAPU = :CHKA :IKAPU;

IKURQU = :IKU :RQU;

ISICHI = :ISI :CHI;

This allows us to write NooJ grammars that generate new two-fold IPS compound verbs:

V_SIP2_INF = <B> (:CHICHI |:CHICHKA |:CHIIKACHI |:CHIIKAMU |:CHIIKAPU |:CHIIKARI |:CHIIKU |:CHIISI |:CHIKAMU |:CHIKU |:CHILLAV |:CHIMU |:CHIPU |:CHIRPARI |:CHITAMU |:CHKAIKACHI … |:IKAMUCHKA |:IKAMUIKACHA |:IKAMUIKACHI |:IKAMUIKAPU |:IKAMUIKARI |:IKAMUISI |:IKAMUKACHA |:IKAMUKU |:IKAMULLAV |:IKAMUNAYA |:IKAMUPAYA)y/INF;
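Since the rule text follows a regular pattern, it can be produced mechanically from a list of valid pairs. The sketch below only assembles strings in the notation shown above; the helper names are hypothetical, and it assumes that `<B>` removes the final -y of the infinitive before the suffixes are appended.

```python
from typing import Iterable, Tuple


def rule_name(*suffixes: str) -> str:
    """Concatenate suffix names in upper case, e.g. ('chi', 'mu') -> 'CHIMU'."""
    return "".join(s.upper() for s in suffixes)


def pair_rule(first: str, second: str) -> str:
    """One definition line such as 'CHIMU = :CHI :MU;'."""
    return f"{rule_name(first, second)} = :{first.upper()} :{second.upper()};"


def top_rule(pairs: Iterable[Tuple[str, str]]) -> str:
    """The alternation over all pair rules, in the layout used for V_SIP2_INF."""
    alternatives = " |".join(f":{rule_name(a, b)}" for a, b in pairs)
    return f"V_SIP2_INF = <B> ({alternatives})y/INF;"


sample_pairs = [("chi", "mu"), ("chi", "pu"), ("isi", "chi")]
for a, b in sample_pairs:
    print(pair_rule(a, b))
print(top_rule(sample_pairs))
# CHIMU = :CHI :MU;
# CHIPU = :CHI :PU;
# ISICHI = :ISI :CHI;
# V_SIP2_INF = <B> (:CHIMU |:CHIPU |:ISICHI)y/INF;
```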

Fig. 3. 3-dimension grammatical verb-generators

Moreover, these combinations can generate three-fold agglutinations by adding one more IPS. The corresponding Boolean matrix contains 7592 entries, but not all of them are grammatical. Manual verification yields only 2952 entries marked 1, i.e. grammatically correct compounds. Figure 3 shows a sample: the last 22 of this list.
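The size of the three-fold matrix follows from the earlier counts: each of the 292 valid pairs can be followed by any of the 26 IPS. The sketch below checks that product and enumerates candidates for a toy excerpt. The sample lists are assumptions chosen from suffixes mentioned in the text, and the printed triples are unverified candidates only.

```python
from itertools import product

# Counts reported in the text.
N_IPS = 26            # interposition suffixes
N_VALID_PAIRS = 292   # two-fold combinations verified as grammatical

# Each valid pair can be followed by any of the 26 IPS.
n_candidates = N_VALID_PAIRS * N_IPS
print(n_candidates)  # 7592

# Toy excerpt: two attested pairs and three IPS mentioned in the text.
VALID_PAIRS = [("chi", "mu"), ("isi", "chi")]
IPS_SAMPLE = ["chka", "ku", "pu"]

# Candidate triples still need manual verification, as described above;
# nothing here says which of them are grammatical.
candidate_triples = [pair + (third,) for pair, third in product(VALID_PAIRS, IPS_SAMPLE)]
for triple in candidate_triples:
    print("-" + "-".join(triple) + "-")
```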

Agglutinations of four dimensions

V_IPS4: chi-iku-na-lla, ku-lla-chka-rqa, chi-isi-mu-chka

Agglutinations of five dimensions

V_IPS5: chi-ku-na-lla-pti, chi-ku-lla-wa-pti, chi-isi-mu-chka-pti

3 The Compound Verbs

Agglutinating these grammatical compound suffixes to the 1400 simple verb lemmas we obtain 43160 grammatically correct compound verbs. In Fig. 4 we present a sample of them.

Fig. 4. A sample of the generated dictionary of 43000 compound verbs

Table 2. Neologisms proposed instead of loans in use

4 The Semantics of the Agglutinations

However, grammatically correct forms are not necessarily meaningful forms. For instance, what is the precise meaning of the verbs llamkarachitamuy or tiyarachitamuy, where llamka- (to work) and tiya- (to sit) are the lemmas and -ra-chi-tamu- is a valid combination of the suffixes -ra-, -chi- and -tamu-?

What are the actual meanings of this large quantity of newly generated verbs? Are they currently used by native speakers? Which ones are really meaningful in the language?

Table 3. Semantic values for suffixes IPS

Many are certainly candidates to become neologisms, like those in Table 2.

But many others do not seem to have a plausible meaning.

We are aware that the only way to answer these questions is manual verification in the field. Nevertheless, to ease this task we have written some NooJ grammars which, as a first step, annotate the suffixes contained in the verbal form, as in Fig. 5. They then automatically propose a glossed translation. For this, we first inventoried the IPS suffixes and their main semantic values, as shown in Table 3.

For the first suffix, CHI, there are three factitive values (given here in English and French for this suffix; the values for the other suffixes are given only in French):

FACT_1: the subject aids, helps (le_sujet_assiste, aide)

FACT_2: the subject invites, authorizes, incites (le_sujet invite, authorise, incite)

FACT_3: the subject forces or commands a third party to do the action (le sujet oblige, commande à un tiers à réaliser l’action)
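The annotation step (identifying which suffixes a verbal form contains) can be illustrated with a small backtracking segmenter. This is a sketch under stated assumptions, not the NooJ grammar: the lemma and suffix inventories below are tiny excerpts taken from the examples in the introduction, and the names `SUFFIX_GLOSS`, `LEMMAS`, `split_suffixes` and `annotate` are hypothetical.

```python
from typing import List, Optional, Tuple

# Excerpt of suffix values, taken from the examples in the introduction.
SUFFIX_GLOSS = {
    "chi": "factitive: make someone do something",
    "ri": "dynamism: start doing the action",
    "ku": "auto-benefactive",
}
LEMMAS = {"asi", "ranti"}  # asiy (to laugh), rantiy (to buy)


def split_suffixes(rest: str) -> Optional[List[str]]:
    """Split `rest` into known suffixes by backtracking; None if impossible."""
    if rest == "":
        return []
    for suffix in SUFFIX_GLOSS:
        if rest.startswith(suffix):
            tail = split_suffixes(rest[len(suffix):])
            if tail is not None:
                return [suffix] + tail
    return None


def annotate(form: str) -> Optional[Tuple[str, List[str]]]:
    """Return (lemma, suffixes) for a compound infinitive, or None."""
    if not form.endswith("y"):
        return None
    body = form[:-1]  # drop the infinitive ending -y
    for lemma in LEMMAS:
        if body.startswith(lemma):
            suffixes = split_suffixes(body[len(lemma):])
            if suffixes:  # require at least one interposed suffix
                return lemma, suffixes
    return None


print(annotate("asirichiy"))  # ('asi', ['ri', 'chi'])
print(annotate("rantikuy"))   # ('ranti', ['ku'])
```

With a full lemma list and all 26 IPS, several segmentations of the same form may be possible, which is one source of ambiguity in annotation.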

Fig. 5. The 2931 annotated 3-dim verbs derived from the verb kuyay (to love)

5 Proposing Automatic Transductors from Quechua Compound Verbs into French

Using Table 3, we have written some NooJ grammars to annotate the 2931 3-dim verbs derived from the verb kuyay (to love), as shown in the sample of Fig. 5.

We have searched for plausible meanings of the generated compound verbs by applying the semantic values of Table 3 to the annotated forms. Fig. 6 shows some results of this approach for the derivations of the verb rimay (to talk):

Fig. 6. Glossed meanings for some compound verbs derived from rimay (to talk)

After verifying the pertinence of these glossed outputs, we may propose a possible meaning for the compound verb, as in the following examples:

ayqiriy,ayqiy,V + FR = “échapper” + FLX = V_SIP_INF + le_sujet_commence_à_DYN_2_recommence_à_réalise_l_action + INF

aisariy,aisay,V + FR = “tirer” + FLX = V_SIP_INF + le_sujet_commence_à_DYN_2_recommence_à_réalise_l_action + INF

which could be interpreted as “the subject starts towing something”, so aisariy should mean to tow, as it has actually been lexicalized.

rimaikuy,rimay,V + FR = “parler” + FLX = V_SIP_INF + le_sujet_courtoisement_COURT_2_soigneusement_COURT_3_amicalement_COURT_4_vers_le_sujet_réalise_l_action + INF

which means “(the subject) talks to someone courteously, carefully, in a friendly way”, and which could in fact have been lexicalized as: to greet.
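The glossed lines above follow a fixed layout, so their assembly can be sketched as a simple string template. The code below is an illustration only: the dictionaries are excerpts reconstructed from the three examples, the gloss attached to -ri- is inferred from the two entries that share it, and `glossed_entry` is a hypothetical helper rather than part of NooJ.

```python
# French translations of the simple verbs used in the examples.
FR_LEMMA = {"ayqiy": "échapper", "aisay": "tirer", "rimay": "parler"}

# Gloss strings as they appear in the examples (excerpt, one IPS each).
GLOSS_FR = {
    "ri": "le_sujet_commence_à_DYN_2_recommence_à_réalise_l_action",
    "iku": (
        "le_sujet_courtoisement_COURT_2_soigneusement_COURT_3_"
        "amicalement_COURT_4_vers_le_sujet_réalise_l_action"
    ),
}


def glossed_entry(infinitive: str, suffix: str) -> str:
    """Assemble one glossed line: derived form, source verb, features, gloss."""
    derived = f"{infinitive[:-1]}{suffix}y"
    return (
        f"{derived},{infinitive},V + FR = “{FR_LEMMA[infinitive]}” "
        f"+ FLX = V_SIP_INF + {GLOSS_FR[suffix]} + INF"
    )


print(glossed_entry("aisay", "ri"))
print(glossed_entry("rimay", "iku"))
```

The output is a raw gloss; deciding that the first line corresponds to "to tow" still requires the manual interpretation described in the text.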

These meanings may be compared with the existing lexicalized entries that we have gathered from our corpus. For the first three, they match well:

Table 4. Automatic glossed translation compared to lexicalized entries

aiqiriy,V + FR = “commencer à fuir, entreprendre un retrait” + SP = “comenzar a huir, emprender la retirada” + FLX = V_TR

rimaikuy,V + FR = “adresser la parole à qqn avec courtoisie” + SP = “dirigir la palabra a alguien atentamente” + FLX = V_TR

amichiy,V + FR = “faire qqn s’ennuyer” + SP = “hacer aburrir a alguien” + FLX = V_TR

aiqiriy,V + FR = “se retirer lentement à une petite distance” + SP = “retirarse lentamente a pequeña distancia” + FLX = V_TR

asiriy,V + FR = “sourire” + SP = “sonreír” + FLX = V_TR

asnariy,V + FR = “commencer à sentir (la viande)” + SP = “comenzar a oler (carne)” + FLX = V_TR

yaikuriy,V + FR = “entrer un peu, un moment, et aussi, entrer en étant de passage” + SP = “entrar un poco, y también: entrar estando de paso” + FLX = V_TR
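Entries in this layout can be read into a simple record for comparison with the automatic glosses. The parser below is a sketch that assumes the layout seen in the entries above (comma-separated head, then `NAME = value` features joined by `+`); `parse_entry` and the regular expression are hypothetical helpers, and bare features without `=` (such as a trailing INF) are ignored.

```python
import re

# NAME = “quoted value”  or  NAME = bare_value
FEATURE = re.compile(r'(\w+)\s*=\s*(?:[“"]([^”"]*)[”"]|([^\s+]+))')


def parse_entry(line: str) -> dict:
    """Parse 'form[,lemma],CAT + NAME = value + ...' into a dict."""
    head, _, rest = line.partition("+")
    parts = [p.strip() for p in head.split(",")]
    entry = {"form": parts[0], "category": parts[-1]}
    if len(parts) == 3:          # form, source verb, category
        entry["lemma"] = parts[1]
    for name, quoted, bare in FEATURE.findall(rest):
        entry[name] = quoted if quoted else bare
    return entry


lexical = parse_entry("asiriy,V + FR = “sourire” + SP = “sonreír” + FLX = V_TR")
print(lexical)
# {'form': 'asiriy', 'category': 'V', 'FR': 'sourire', 'SP': 'sonreír', 'FLX': 'V_TR'}

glossed = parse_entry("aisariy,aisay,V + FR = “tirer” + FLX = V_SIP_INF")
print(glossed["lemma"], glossed["FR"])  # aisay tirer
```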

Table 4 shows some more comparisons for other verbs.

6 Results

As a result of these manual verifications, carried out on some hundreds of cases, we have compiled a trilingual dictionary (Qu, Fr, Sp) of Quechua compound verbs. It contains 1600 entries, with their Spanish and French translations, which can be added to our lexicon of 1400 simple verbs. We present below a sample of the entries of this dictionary (Fig. 7):

Fig. 7. A sample of the entries of the dictionary of compound Quechua verbs

7 Text Annotations

With the help of this dictionary and some NooJ grammars like V_SIP1_INF, presented above, we can automatically annotate a Quechua text. We applied them to a collection of eight Quechua tales. Fig. 8 shows the annotated correspondences obtained. We found around 90% successful matches, 6% partial matches and 4% incorrect matches, the latter mainly because of ambiguities.

Fig. 8. Recognition of compound verbal forms in one text of the corpus

8 Conclusion

We have studied the key role of interposed suffixes (IPS) in the generation of new Quechua verbs. After studying thousands of combinations, we found altogether 3249 valid compounds of up to three IPS suffixes, each of which can generate a new verb from a single simple verb. This considerably increases the verb lexicon. In fact, applying the NooJ grammar V_SIP_INF to our dictionary of around 1400 simple verbs gives us 43160 new compound Quechua verbs. With the help of morpho-syntactic NooJ grammars and the semantic annotations corresponding to the IPS suffixes, we propose a glossed form in order to work out the meaning of these verbs.

Perspectives

  • Increase the compound verb bilingual dictionary.

  • Improve our grammars to obtain less ambiguous translations.