1 Introduction

Our goal is to construct a syntactic grammar that recognizes all the grammatically correct paraphrases of a Quechua sentiment sentence such as {Gervasio Romildata kuyan/Gervasio loves Romilda}. Like Languella [10], I adopt Harris’s concept of paraphrase, which Harris [9] borrowed from the concept of morphism in mathematics. A morphism is a structure-preserving map from one mathematical structure to another. According to Harris, a sentence is a paraphrase of another if the transformation changes its morphophonemic shape while preserving the original lexical morphemes and meaning. Paraphrasing has been used by many authors, such as Barreiro [1, 2], who used it to prepare texts for machine translation; Ben et al. [3], who used it for bilingual Arabic-English machine translation of relative clauses; and Fehri et al. [8], who used it for Arabic–French translation of named entities.

Silberztein [11, 12] shows how, by combining a parser and a generator, and applying them to a syntactic grammar, one can build a system that takes one sentence as its input, and produces all the sentences that are morpho-syntactically or semantically related to the original sentence, or share the same lexical material with it. Figure 1 shows our NooJ grammar for the passive transformation:

  • [Passive]{Gervasio Romildata kuyan/Gervasio loves Romilda} = {Romildam Gervasiopa kuyasqan/Romilda is loved by Gervasio}.

    Fig. 1. [Passive]: {Romildam Gervasiopa kuyasqan/Romilda is loved by Gervasio}

The graph of Fig. 1 uses three variables, $N0 (the subject S), $N1 (the object O) and $V (the verb V). When parsing the sentence {Gervasio Romildata kuyan (N0 N1 V)/Gervasio loves Romilda}, the variable $N0 stores the word {Gervasio}, $N1 stores the word {Romilda} and $V stores the word {kuyan/loves}.

The grammar output “$N1_m $N0_#pa $V_V+PPA+s+3” produces the string {Romildam Gervasiopa kuyasqan/Romilda is loved by Gervasio}, where V+PPA+s+3 denotes the past participle of the verb {kuyay/to love} in the third person singular.
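The effect of this output can be mimicked outside NooJ in a few lines of Python. The sketch below is only an illustration and not the NooJ grammar of Fig. 1: the function name `passive` is hypothetical, and treating -ta, -n, -m, -pa and -sqa-n as plain string endings is a simplifying assumption that holds for this example sentence.

```python
def passive(sentence: str) -> str:
    """Toy analogue of the [Passive] output "$N1_m $N0_#pa $V_V+PPA+s+3".

    Assumes a three-word N0 N1 V sentence whose object ends in the
    accusative -ta and whose verb ends in the third person singular -n.
    """
    n0, n1, v = sentence.split()
    if not (n1.endswith("ta") and v.endswith("n")):
        raise ValueError("expected an object in -ta and a verb in -n")
    obj = n1[:-2]    # $N1: Romildata -> Romilda
    stem = v[:-1]    # verb stem: kuyan -> kuya
    # N1 + assertive -m, N0 + genitive -pa, stem + -sqa (participle) + -n
    return f"{obj}m {n0}pa {stem}sqan"


print(passive("Gervasio Romildata kuyan"))
```

A real grammar would look the verb up in the dictionary instead of slicing strings, which is what the $V_V+PPA+s+3 notation does in NooJ.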

2 Syntactic PoS Transformations

2.1 Noun Transformations

In order to write a grammatical transformation of a sentence in Quechua, an agglutinative language, one needs to take into account the morpho-syntactic transformations of each PoS included in the sentence. This is because each PoS may be inflected or transformed by agglutination of one or more suffixes, acting as operators, coming from their corresponding class of suffixes.

For instance, Calisto, a proper name, may be inflected, as follows: {Calistom/it is Calisto} (-m is the suffix of assertion), {Calistos/people say that it is Calisto} (-s is the suffix of hearsay), {Calistoqa/It’s Calisto} (-qa is the suffix of topic), {Calistochá/It is probably Calisto} (-chá is the suffix of uncertainty), etc.

All these inflected forms are obtained by applying N-suff operators using the NooJ grammar of Duran [7]. Many binary or ternary combinations of these suffixes may also generate grammatical transformations of (N). Figure 2 shows part of the grammar that generates the mono-suffix transformations of the proper name Calisto.
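As a rough illustration of what a mono-suffix operator does, the following Python sketch attaches each of the four suffixes quoted above to a vowel-final noun. The names `N_SUFF` and `mono_suffix_forms` are hypothetical, the table holds only the four suffixes cited in the text (not the full N-suff class), and plain concatenation is assumed to be sufficient for a noun ending in a vowel.

```python
# The four mono-suffix operators quoted in the text, with their glosses.
N_SUFF = {
    "m": "assertion",
    "s": "hearsay",
    "qa": "topic",
    "chá": "uncertainty",
}


def mono_suffix_forms(noun: str) -> dict[str, str]:
    """Return {inflected form: gloss} for each single suffix in N_SUFF."""
    return {noun + suffix: gloss for suffix, gloss in N_SUFF.items()}


for form, gloss in mono_suffix_forms("Calisto").items():
    print(f"{form}\t{gloss}")
```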

Let us consider the following direct sentence: {Calisto wasinta llimpin/Calisto paints his house}. Replacing the proper noun Calisto by some of its inflections, we obtain transformed sentences like:

  • {Calistos wasinta llimpin/they say that Calisto paints his house}

  • {Calistoqa wasinta llimpin/It’s Calisto that paints his house}

  • {Calistochá wasinta llimpin/it is probable that Calisto paints his house}

    Fig. 2. Mono-suffix transformation of a subject

However, this grammar may also generate some ungrammatical forms, such as {*Calistop wasinta llimpin/*Calisto’s paints his house}, which is erroneous because Calistop is in the genitive case. This is why we need to add syntactic constraints to this grammar in order to obtain more accurate outputs.

Furthermore, in Quechua, we may also obtain transformations by combining two or more N-suff suffixes. For instance, according to Duran [5, p. 73], using two agglutinated nominal suffixes, we obtain more than 320 grammatical transformations of a proper noun, such as: {Gervasio-cha-m llimpin/little Gervasio does paint}; {Gervasio-raq-mi llimpin/Gervasio paints first}; {Gervasio-lla-m llimpin/only Gervasio paints}; etc.

For three agglutinated nominal suffixes, we obtain more than 720 grammatical transformations, such as {Gervasio-cha-lla-m llimpin/it’s only little Gervasio who paints}; {Gervasio-cha-lla-s llimpin/they say that it’s only little Gervasio who paints}; {Gervasio-nchik-lla-s llimpin/they say that it’s only our little Gervasio who paints}; etc.
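The combinatorics behind these multi-suffix forms can be sketched in Python. Everything structural here is an assumption made for illustration: the three-slot ordering in `SLOTS`, the function names, and the rule that -m/-s become -mi/-si after a consonant (inferred from forms such as {paymi} and {Gervasio-raq-mi} in this paper). With only six suffixes the sketch yields 12 two-suffix and 8 three-suffix forms, so it does not reproduce the counts reported from Duran [5].

```python
from itertools import combinations, product

VOWELS = set("aeiouáéíóú")

# Hypothetical slot order; each slot lists alternative suffixes.
SLOTS = [
    ["cha", "nchik"],   # diminutive / "our"
    ["lla", "raq"],     # "only" / "first"
    ["m", "s"],         # assertion / hearsay (shapes used after a vowel)
]


def attach(base: str, suffix: str) -> str:
    """Attach one suffix, using -mi/-si when the base ends in a consonant."""
    if suffix in ("m", "s") and base[-1].lower() not in VOWELS:
        suffix += "i"
    return base + suffix


def combos(noun: str, n: int) -> list[str]:
    """All forms with n suffixes taken from n distinct slots, in slot order."""
    forms = []
    for slots in combinations(SLOTS, n):
        for choice in product(*slots):
            form = noun
            for suffix in choice:
                form = attach(form, suffix)
            forms.append(form)
    return forms


print(combos("Gervasio", 2))
print(combos("Gervasio", 3))
```

The fixed slot order plays the role of a syntactic constraint: it prevents sequences such as *-m-cha from being generated at all.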

2.2 Pronoun Inflections

For a pronoun like {pay (third person singular)} we will have the following transformations:

  • {paymi/it is him (assertion)}, {paysi/people say that it is him (hearsay)},

  • {payqa/concerning him (topic)}, {paychá/It is probably him (uncertainty)}.

The class of pronominal suffixes contains 39 elements, which implies that we may obtain 39 mono-suffix inflections. However, only 28 of these suffixes generate transformed pronouns that can be used as the subject of the transformed sentence.

If we again take the sentence {Calisto wasinta llimpin/Calisto paints his house} and replace the proper noun Calisto by some of these transformed pronouns, we obtain sentences like these:

  • {Paysi wasinta llimpin/they say that he paints his house}

  • {Paychá wasinta llimpin/it is probable that he paints his house}.

2.3 Verb Derivation

For any Quechua verb, we have two classes of suffixes that derive verbs: interposition suffixes (IPS) and postposition suffixes (PPS).

Fig. 3. Kuyay (to love) derived by two-dimensional combinations of IPS suffixes

Taking the sentiment verb {kuyay/to love} and parsing it with the grammar of Fig. 3, we obtain 624 new derived verbs with {kuya} as the lemma and a combination of two IPS suffixes (labelled SIP in the grammar names), as in the following sample:

  • # Dictionary generated automatically

  • kuyarichkay,V+FR=“to start loving”+FLX=V_SIP_INF+DYN+CHKA+INF;

  • kuyapayarquy,V+FR=“to love repeatedly in a short time”+FLX=V_SIP_INF+FREQ+RQU+INF;

  • kuyapayariy,V+FR=“to love repeatedly in a delicate manner”+FLX=V_SIP_INF+FREQ+RI+INF;

  • kuyaykachamuy,V+FR=“to love in a dispersed manner”+FLX=V_SIP_INF+ARO+MU+INF; …

For three-layer combinations of IPS suffixes, we apply the algebraic grammar V_SIP_INF = :V_SIP1_INF|:V_SIP2_INF|:V_SIP3_INF (for details, see Duran [5]). It generates 3,006 new derived verbs in the infinitive form, ready to be further derived or conjugated, as shown in the sample below:

  • kuyaparuchkay,V+EN=“to keep loving in an intensive manner”+FLX=V_SIP_INF+PEAU+PRES+PROG+INF;

  • kuyaykachapayamuy,V+EN=“to love repeatedly in a dispersed manner”+FLX=V_SIP_INF+DISP+FREQ+ACENT+INF;

  • kuyapayaykarikuy,V+EN=“to love mutually repeatedly and with care”+FLX=V_SIP_INF+FREQ+PONC+AUBE+INF;
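The union structure of V_SIP_INF (one-, two- and three-layer derivations) can be illustrated with a small Python sketch. The suffix inventory `IPS_ORDER` contains only six suffixes read off the samples above, and their left-to-right order is a hypothetical assumption chosen to be compatible with those samples; the function names are also illustrative. The sketch over-generates freely and is not a substitute for the constraints encoded in the NooJ grammar.

```python
from itertools import combinations

# Hypothetical left-to-right order of a few IPS suffixes seen in the samples.
IPS_ORDER = ["ykacha", "paya", "ri", "rqu", "mu", "chka"]


def derive(stem: str, layers: int) -> list[str]:
    """Infinitives with exactly `layers` IPS suffixes, kept in IPS_ORDER."""
    return [stem + "".join(chosen) + "y"
            for chosen in combinations(IPS_ORDER, layers)]


def v_sip_inf(stem: str) -> list[str]:
    """Union of the one-, two- and three-layer derivations."""
    return derive(stem, 1) + derive(stem, 2) + derive(stem, 3)


forms = v_sip_inf("kuya")
print(len(forms))
print([f for f in forms if f.startswith("kuyapaya")])
```

Because `combinations` preserves the order of `IPS_ORDER`, each suffix can occur at most once and always in the same relative position, which is the simplest possible stand-in for the ordering rules of the real grammar.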

3 The Verbal PPS Transformations

To obtain one layer of PPS inflections of a verb, we apply the grammar of Fig. 4.

Fig. 4. Rimay (to talk) inflected by one-layer postposition suffixation (PPS)

To generate all the binary PPS transformations of a verb, following the guidelines given in the work of Duran [5, pp. 255–257], I have constructed the NooJ grammar

  • V_PPS2 = :PPS2 | :PPS2_F;

where PPS2 contains 72 agglutinations for the present tense, such as the following: {chusinam | manchá | manchik … | punis | punitaq | … raqchik | raqchu? | taqchu | taqmá | taqmi | taqsi | taqyá}. PPS2_F corresponds to the future paradigms. This grammar produces 1,008 binary PPS transformations of the verb kuyay, such as those appearing in the following sample:

  • kuyankutaqchá, kuyanikuchusinam, kuyanikuraqchá, kuyankipuni chu, kuyankipunichu?, kuyankirajsi, kuyaniñachik, kuyanipunimá, kuyairaqyá, kuyaikumanraq,…

One of the important properties of verb transformation in Quechua is that one can compose IPS and PPS transformations in order to obtain mixed transformations like this one:

  • {rima-yku-ni-raq-mi (rimaykuniraqmi/First of all, I presented my greetings)}

where we have one IPS (-yku-) between the lemma and the ending -ni (first person singular), and two PPS (-raq, -mi) after the ending.
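The position of each morpheme class in such a mixed form can be made explicit with a short Python sketch. The function `mixed_form` and its parameter names are hypothetical; it performs plain concatenation and does not check whether a given combination is grammatical.

```python
def mixed_form(stem, ips, ending, pps):
    """Concatenate stem + IPS suffixes + personal ending + PPS suffixes.

    Returns the hyphen-segmented form and the surface form.
    """
    morphemes = [stem, *ips, ending, *pps]
    return "-".join(morphemes), "".join(morphemes)


segmented, surface = mixed_form("rima", ["yku"], "ni", ["raq", "mi"])
print(segmented, surface)
```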

The grammar of Fig. 5, constructed following Duran [5, pp. 257–258] and Duran [4], generates mixed transformations. In the example below, I apply a mixed grammar, composed of one-layer IPS and two-layer PPS transformations, to the verb {rimay/to talk}. This grammar, V_MIX12, generates all the transformations containing one IPS transformation and two PPS transformations. After parsing, we obtain 17,280 mixed transformed verbal forms, as in the sample below:

  • rimakamunraqmi,V+EN=“to talk”+FLX=V_MIX12+AOL+s+3

  • rimakachankichikmanpas,V+EN=“to talk”+FLX=V_MIX12+ARO+p+2

  • rimakachankichikpaschá,V+EN=“to talk”+FLX=V_MIX12+ARO+p+2

  • rimakachankichikpaschik,V+EN=“to talk”+FLX=V_MIX12+ARO+p+2

Fig. 5. A mixed IPS-PPS transformation grammar for a direct transitive sentence

Let us consider some transformations of the sentence {Calisto mamanta riman/Calisto talks to his mother} that include these mixed inflections of the verb {rimay/to talk}:

  • {Calistos mamanta rimaykunraq/they say that, first of all Calisto talked to his mother}

  • {Calistoqa mamanta rimaykullanraq/concerning Calisto, he first of all talked to his mother with much respect}

  • {Calisto mamanta rimariykunqaraq/Calisto will talk whispering first to his mother}.

In Fig. 6, we show the syntactic grammar, which gathers these combined transformations. It is capable of recognizing more than one million transformed sentences of a direct transitive sentence such as {Gervasio Romildata kuyan/Gervasio loves Romilda}.

Fig. 6. A grammar which recognizes the transformations of “Gervasio Romildata kuyan”

3.1 The Class of Sentiment Verbs

Before introducing our study on paraphrases of sentiment predicates, we will present our dictionary of sentiment verbs. The multilingual electronic dictionary (QU-FR-QU) that we have been building during the last eight years has been enhanced with new linguistic information (namely, inflectional, derivational and morpho-syntactic properties) as well as with some semantic relations. This permits, among other linguistic phenomena, the generation of paraphrases. This dictionary includes the class of sentiment verbs, marked by “+sent”. Figure 7 shows an excerpt of this class of verbs (QU-FR).

Fig. 7. Excerpt of the class of QU sentiment verbs.

3.2 Elementary S-Transformation of a Sentiment Sentence

Let us consider the declarative initial sentence of the type N0N1V:

  {Gervasio Romildata kuyan/Gervasio loves Romilda}   (1)

which we want to paraphrase. We present some elementary transformations of an SOV sentence.

  (1) The permutation N1_V: [PermN1_V]

In QU it is possible to permute the verb and the object without modifying the semantics of the sentence:

  • [PermN1_V] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Gervasio kuyan Romildata/Gervasio loves Romilda},

    which can be symbolized as: [PermN1_V] (N0N1V) = N0 V N1

  (2) The permutation N0_V: [PermN0_V]

It is also possible to permute the verb and the subject without modifying the semantics of the sentence:

  • [PermN0_V] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Kuyan Gervasio Romildata/Gervasio loves Romilda},

    which can be symbolized as: [PermN0_V](N0N1V) = V N0 N1

  (3) Pronominalize the subject N0: [ProN0]. We will have:

    • [ProN0] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {pay Romildata kuyan/He loves Romilda},

      which can be symbolized as: [ProN0](N0N1V) = pay N1 V

  (4) Pronominalize the object N1: [ProN1]. We will have:

    • [ProN1] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Gervasio payta kuyan/Gervasio loves her},

      which can be symbolized as: [ProN1](N0N1V) = N0 payta V

  (5) Pronominalize both the subject and the object N0, N1: [ProN0N1]. We will have:

  • [ProN0N1] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Pay payta kuyan/He loves her},

    which can be symbolized as: [ProN0N1](N0N1V) = pay payta V

  (6) Nominalize the verb V: [Vnom_i]. We will have:

  • [Vnom_i] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Romildam Gervasiopa kuyainin/Romilda is Gervasio’s love},

    which can be symbolized as: [Vnom_i](N0N1V) = N1m N0pa V, where V appears in its nominalized form {kuyainin}.

  (7) Nominalize the verb V: [Vnom_j]. We will have:

  • [Vnom_j] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Romildam Gervasiopa kuyaqnin/Romilda is Gervasio’s lover},

    which can be symbolized as: [Vnom_j](N0N1V) = N1m N0pa V+NV+POSC_c+3+s,

    where NV+POSC_c+3+s stands for: nominalized verb as agentive and possessive, in the third singular person.

  (8) Passive of the verb V: [Passive]. We will have:

  • [Passive] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Romildam Gervasiopa kuyasqan/Romilda is loved by Gervasio},

    which can be symbolized as: [Passive](N0N1V) = N1m N0pa V+PPA+s+3

  (9) Cleft operator: [Cleft_0]

Here, a single message is divided into two clauses. We will have:

  • [Cleft_0] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Gervasiom kachkan Romilda kuyaq/It is Gervasio who loves Romilda},

    which can be symbolized as: [Cleft_0](N0N1V) = N0m kachkan N1 V+NOM_V+QS

  (10) Cleft: [Cleft_1]

  • [Cleft_1] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Romildatam Gervasioqa kuyan/It is Romilda that Gervasio loves},

    which can be symbolized as: [Cleft_1](N0N1V) = N1tam N0qa V+PR+3+s

  (11) Cleft: [Cleft_2]

  • [Cleft_2] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Gervasiom Romilda kuyaqqa/Gervasio is the one who loves Romilda},

    which can be symbolized as: [Cleft_2](N0N1V) = N0m N1 V+NOM_V+QS+THE

  (12) Adverb: [ADV_V]. We will have:

  • [ADV_V] {Gervasio Romildata kuyan/Gervasio loves Romilda} = {Gervasio Romildata achkallataña kuyan/Gervasio loves Romilda really a lot},

    which can be symbolized as: [ADV_V](N0N1V) = N0 N1ta achkata V+PR+3+s.
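Operators (1) to (5) only reorder or replace whole words, so they can be sketched as Python functions over an (N0, N1, V) triple. This is a minimal illustration under stated assumptions: the class `Sentence` and the function names are hypothetical, the object is assumed to carry its accusative -ta already, and capitalization of the first word is ignored. The operators (6) to (12) need morphological generation and are not covered.

```python
from typing import NamedTuple


class Sentence(NamedTuple):
    n0: str   # subject
    n1: str   # object, already carrying the accusative -ta
    v: str    # conjugated verb


def perm_n1_v(s: Sentence) -> str:
    return f"{s.n0} {s.v} {s.n1}"      # [PermN1_V](N0N1V) = N0 V N1


def perm_n0_v(s: Sentence) -> str:
    return f"{s.v} {s.n0} {s.n1}"      # [PermN0_V](N0N1V) = V N0 N1


def pro_n0(s: Sentence) -> str:
    return f"pay {s.n1} {s.v}"         # [ProN0](N0N1V) = pay N1 V


def pro_n1(s: Sentence) -> str:
    return f"{s.n0} payta {s.v}"       # [ProN1](N0N1V) = N0 payta V


def pro_n0n1(s: Sentence) -> str:
    return f"pay payta {s.v}"          # [ProN0N1](N0N1V) = pay payta V


s = Sentence("Gervasio", "Romildata", "kuyan")
for op in (perm_n1_v, perm_n0_v, pro_n0, pro_n1, pro_n0n1):
    print(f"{op.__name__}: {op(s)}")
```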

Table 1 shows a partial list of these transformations.

Table 1. Elementary transformations of {Gervasio Romildata kuyan}

For many of these elementary transformations, it is possible to construct grammars that perform the reverse operation, as we can see in Fig. 8.

Fig. 8. Grammar for the passive-inv operation of {Gervasio Romildata kuyan} → Romildam Gervasiopa kuyasqan

4 Composition of Elementary Paraphrasers

In order to obtain composed paraphrasers C_PARPH, it is possible to construct a sequential composition of two or more elementary paraphrasers. They should, of course, respect certain syntactic constraints. Below are some examples.

  (1) As said before, one or more of the PoS contained in each of these elementary transformations may be derived or inflected separately. For example, for (12) we may get {Gervasio Romildata achkallataña kuyan/Gervasio loves Romilda really a lot}, where the noun and the verb have been derived.

  (2) The 12 operators presented earlier can be sequentially applied to a sentence, respecting the syntactic rules, in order to obtain composed paraphrasers, such as the following:

    • PermN1_V+ProN0

    • PermN1_V+ProN1

    • PermN1_V+ProN0N1

    • PermN1_V+ADV_achka

    • PermN0_V+PermN1_V

    • ProN0+PermN1_V

    • ProN1+PermN1_V

    • ProN0N1+PermN1_V…

To apply the first composed paraphraser of the list, [PermN1_V+ProN0], we first apply [ProN0]:

  • [ProN0] {Gervasio Romildata kuyan} = {pay Romildata kuyan}

Then we apply PermN1_V:

  • [PermN1_V] {pay Romildata kuyan} = {pay kuyan Romildata}
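This two-step application can be sketched as function composition in Python. The sketch is an illustration under assumptions rather than the NooJ implementation: sentences are represented as lists of (role, word) pairs so that an operator can still find the object and the verb after an earlier operator has changed the words, and the names `Tagged`, `OPERATORS` and `compose` are hypothetical. Following the order used above, the rightmost operator in a name such as PermN1_V+ProN0 is applied first.

```python
from functools import reduce

Tagged = list[tuple[str, str]]   # (role, word) pairs, roles "N0", "N1", "V"


def pro_n0(s: Tagged) -> Tagged:
    """Replace the subject by the pronoun pay."""
    return [(r, "pay" if r == "N0" else w) for r, w in s]


def perm_n1_v(s: Tagged) -> Tagged:
    """Swap the positions of the object and the verb."""
    i = next(k for k, (r, _) in enumerate(s) if r == "N1")
    j = next(k for k, (r, _) in enumerate(s) if r == "V")
    out = list(s)
    out[i], out[j] = out[j], out[i]
    return out


OPERATORS = {"ProN0": pro_n0, "PermN1_V": perm_n1_v}


def compose(name: str):
    """Build [A+B]: the rightmost operator is applied first."""
    steps = [OPERATORS[n] for n in reversed(name.split("+"))]
    return lambda s: reduce(lambda acc, op: op(acc), steps, s)


start = [("N0", "Gervasio"), ("N1", "Romildata"), ("V", "kuyan")]
result = compose("PermN1_V+ProN0")(start)
print(" ".join(word for _, word in result))
```

Keeping the role labels alongside the words is one simple way to let later operators remain applicable; a syntactic constraint could be added by having an operator raise an error when its required roles are missing.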

We may now compose three elementary transformations, such as below:

  • [PermN1_V+ADV_achka+ProN0] {Gervasio Romildata kuyan}. We will have the following sequence of results:

    ProN0: pay Romildata kuyan

    ADV_achka: pay Romildata achkata kuyan

    PermN1_V: pay Romildata kuyan achkata

  • [PermN1_V+ADV_achka+ProN0] {Gervasio Romildata kuyan} = {pay Romildata kuyan achkata}.

    Fig. 9. Elementary paraphraser for an SOV sentence

The grammar of Fig. 9 concatenates these compositions. When applied, it generates 2,940 paraphrases like those appearing under the graph.

5 Conclusion

In this paper, I have shown several NooJ grammars capable of recognizing and producing a large number of sentences that are generated by transformations from an initial direct sentence. In particular, I presented grammars corresponding to elementary transformations of any direct sentiment predicate like {Gervasio Romildata kuyan/Gervasio loves Romilda}. By composing elementary paraphrasers, I have built a complex paraphraser which generates a large number of paraphrases.

I plan to construct a more comprehensive set of transformations and paraphrasing grammars, which I hope will help me in the implementation of the resources for our machine translation project for Quechua.