17.1 Introduction

The issues we address in this paper focus on the interface between lexical semantics and morphology. Just as lexical semantics can be viewed from a compositional point of view, morphology (that is, morpho-semantics) can be understood as a compositional semantics-constrained mechanism according to Corbin’s approach (Corbin 1987, 2001). These two levels of description give us distinct types of intrinsic information on the semantic content of derived words. One way to establish links between the two levels of description is to choose a common formalism for their representation.

Basing our work on the study of the prefixation by M. Aurnague and M. Plénat (1996, 1997), limited to the popular prefix é- in French, we (Jacquey and Namer 2003; Namer and Jacquey 2003) have suggested modelising the semantic role of this prefix within the framework of the Generative Lexicon Theory (Pustejovsky 1995). In this paper, we further this approach to modelisation of word formation mechanisms and apply this modelisation to account for another word formation (wf) process type in French, namely the NtoV versus VtoN conversion. Our reasons for focusing on the phenomenon of conversion are threefold.

First, it is a non-conventional mechanism because no affix is involved. The absence of such a morphological mark is crucial as far as conversion orientation is concerned. On the basis of pairs such as dance N /dance V and butter N /butter V, what has to be decided is whether the (output) verb is converted from the (input) noun or vice versa. This decision will allow us to derive the definition of the output from that of the input.

Second, verb and noun pairs related by the conversion process are of interest not only for wf research, but also for lexical semantics. For instance, Government and Binding-oriented literature sometimes refers to conversion as a noun incorporation phenomenon (Hale and Keyser 1993), while Generative Lexicon Theory proposes to characterise the structure of such pairs by means of the so-called shadow argument (Pustejovsky 1996).

Finally, this wf process is a multilingual phenomenon. It is both productive and frequent, and found at least in French and English.

Focusing on the French language, our formalisation proposal aims to account for the following aspects, as will be discussed in this chapter: (1) a corpus-based analysis producing 2,500 homograph Noun/Verb pairs; (2) a ranking of these pairs according to semantic criteria; (3) a modelisation proposal stemming from the analysis of the most frequent and productive classes.

The rest of the chapter is structured as follows: Sect. 17.2 briefly summarises why and how we propose to modelise wf processes within the Generative Lexicon Theory; Sects. 17.3 and 17.4 focus on the conversion process itself. Section 17.3 compares hypotheses coming from theoretical linguistic studies with empirical results obtained by means of the corpus-based analysis mentioned above. Section 17.3 ends with a summary table which ranks conversion classes as a result of this comparison. On the basis of these classes, Sect. 17.4 finally suggests two formal models, for the NtoV and VtoN conversion processes respectively.

17.2 Word Formation Modelisation Within GL: MS-CS

This section focuses on the background to our modelisation. First, Sect. 17.2.1 gives the theoretical linguistic background on which wf processes rely. Section 17.2.2 deals with the motivation for modelisation itself.

17.2.1 Theoretical Background: Corbin’s Approach to Word Formation

Among wf theories, research initiated for French in Corbin (1987) provides descriptions that put semantics in the forefront. More precisely, her wf theory is based on three statements:

  1. Morphology is autonomous. In other words, the lexicon of the morphologically constructed words is generated by domain specific rules: wf rules and their outputs are independent of e.g. syntactic information. This statement agrees with e.g. (Aronoff 1976);

  2. Morphology is regular, i.e. the morphologically constructed words lexicon is regular. Surface exceptions can always be given some explanation whether semantic, phonetic, diachronic, etc.;

  3. wf rules associate several kinds of constraints, phonetic, semantic and categorial ones being the most important. It has been established that categorial conditions for wf can be derived from semantics (Corbin 2001; Dal 1997).Footnote 1

Consequently, Corbin’s theory foresees that part of the lexical meaning of a morphologically constructed word, called “lexical constructed meaning”, is built together with its constructed form.

Considering wf rules from this theoretical point of view clarifies the relationship between lexical semantics and morphology which rules wf, since a morphologically constructed word is above all a matter of semantic constraints. Constraints are exerted both on the base (called here input) and on the derived word (called here the output), through wf processes, which can be suffixation, prefixation, conversion or compounding processes. The lexical meaning of both the input and output are opposed to the meaning of the wf process itself, which can be seen as a computational (or instructional) device. In contrast with an input or an output, a wf process does not “mean” anything, but provides a guideline for the output meaning.

As stated in Sect. 17.1, in order to enable wf rules to be displayed as lexical semantics constraints, one way to proceed is to choose a common formalism of representation. The chosen formalism must be able to express semantic constraints at distinct levels, especially at syntactic and semantic levels, for any wf process. In addition, a given wf process may select only specific aspects of its input meaning, in order to build the meaning of the corresponding output.

The expressivity of the Generative Lexicon Theory (gl) makes it suitable to represent the just mentioned constraints. More precisely, gl is modular enough to integrate a level of morphological description and it is rich enough to constrain both input and output of a given wf process. The next section summarises how the gl-based wf mechanism has been set.

17.2.2 Formal Background: Our Approach to wf Modelisation

To achieve the goal of modeling wf in French, two basic approaches can be considered: (a) encoding the affixes themselves or (b) setting up abstract parametrised lexical units describing the outputs. One argument for the first choice would be the fact that affixes can be considered as some type of predicates operating on and controlling both the input and the output, from structural, categorial and semantic points of view. The first approach though is inadequate for two main reasons:

  1. Encoding affixes to model wf would mean reducing wf to affix-based processes, and would consequently exclude both compounding and conversion. Keep in mind that the latter consists, loosely speaking, in building new words by means of a simple part of speech (pos) changeFootnote 2;

  2. The very nature of affixes is another counterargument. According to the morphological theory our study is based on, an affix does not belong to any of the major pos categories. In addition, it bears no referential meaning: consequently, it does not seem logical to modelise its semantic content since it has no proper semantic content.

Thus, in previous studies (Jacquey and Namer 2003; Namer and Jacquey 2003), we turned to the second approach: namely, designing an abstract model which is intended to define the common properties shared by the outputs of a given wf process,Footnote 3 whatever the involved morphological process. This abstract lexical unit (alu) is instantiated through the input content of the wf process, which provides thus the abstract output with distinctive, specific properties.

In order to constrain the combination of an alu with and only with licensed inputs, we have decided to add a new attribute-value pair at the most alu embedding level: this pair, encoding the required semantic features of the wf inputs, is referred to as the morphological structure (ms).

Finally, in order to instantiate a well-formed constructed word content from the alu, we assumed one unification mechanism: the morphological structure composition schema (ms-cs).Footnote 4 Through ms-cs, only the candidate inputs with the appropriate features matching the ms content of the alu are selected for the formation of well-formed outputs. This unification procedure also entails the instantiation of the right features on the output.

Based on the unification principle, the morphological structure composition schema (ms-cs) in Fig. 17.1 governs the composition between a given abstract lexical unit and an actual input, in order to build the meaning of a well-formed output. In our conception, ms-cs is meant to be wf process-independent: among its arguments, alus are thus likely to represent any wf process, and actual words (both input and output) can belong to any major pos categories.

Fig. 17.1 Composition Schema (ms-cs)

The ms-cs behavior is twofold. First, when the alu morphological structure (ms) unifies with the actual input, the success of this unification, noted by the \( \boxed{\text{ms}} \) index,Footnote 5 means that this input satisfies the constraints required by the alu ms. As a first consequence, relevant features are propagated into the appropriate alu structures, namely the argumental structure \( \boxed{\text{as}} \), the event structure \( \boxed{\text{es}} \) and the qualia structure \( \boxed{\text{qs}} \). Second, the ms-cs schema ensures the propagation of the updated features from the alu to the output in order to provide the latter with a well-formed semantic content.
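The twofold behaviour just described can be sketched in a few lines of Python. This is only an illustrative approximation, not the formalism of Fig. 17.1: feature structures are reduced to nested dictionaries, and the names `unify`, `ms_cs`, `instrumental_alu`, the `build` function and the feature values (`"artefact"`, `"extract"`, `"abstract"`) are all assumptions introduced for the example.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts or atomic values).

    Returns the merged structure, or None when two atomic values clash.
    Assumption of this sketch: None is never used as a feature value.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, b_value in b.items():
            if key in merged:
                sub = unify(merged[key], b_value)
                if sub is None:
                    return None  # clash somewhere below this key
                merged[key] = sub
            else:
                merged[key] = b_value
        return merged
    return a if a == b else None


def ms_cs(alu, input_entry):
    """Step 1: unify the ALU's MS with the input; step 2: build the output."""
    matched = unify(alu["ms"], input_entry)
    if matched is None:
        return None  # the input is not licensed by this ALU
    return alu["build"](matched)


# Hypothetical ALU: licenses artefact nouns and projects their telic role.
instrumental_alu = {
    "ms": {"cat": "N", "qs": {"type": "artefact"}},
    "build": lambda noun: {
        "cat": "V",
        "as": {"s_arg0": noun["lemma"]},
        "qs": {"telic": noun["qs"]["telic"]},
    },
}

drain = {"cat": "N", "lemma": "drain",
         "qs": {"type": "artefact", "telic": "extract"}}
danse = {"cat": "N", "lemma": "danse", "qs": {"type": "abstract"}}

print(ms_cs(instrumental_alu, drain))
print(ms_cs(instrumental_alu, danse))
```

The first call succeeds because the input satisfies every constraint in the MS; the second fails on the clash between the two `type` values, so no output is built.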

The role played by our morphological device, in which ms-cs interacts with an abstract lexical unit, can be viewed as a lexical semantics-driven modelisation of the so-called ‘word-based’ model in morphological theory. Such a model (Bybee 1988; Koenig 1999; Fradin 2003) is opposed to the ‘morpheme-based’ model (Selkirk 1982; Di Sciullo and Williams 1987; Lieber 1992). Unlike the latter, the word-based model is suitable for the description of non-concatenative word formation processes. In word-based models, the relationship between complex words is captured by formulating word-schemas which represent the common features of sets of morphologically related words. According to (Haspelmath 2002:47), “a word-schema is like a lexical entry in that it contains information on pronunciation, syntactic properties and meaning, but it may contain variables. In this way, it abstracts away from the differences between the related words and just expresses the common features”. A word-schema subsumes a set of words, which in turn match given schemas. Morphological relationships are therefore represented by correspondences between word-schemas. Word pairs that match corresponding schemas are thus related by a particular morphological relationship.

Hence it can be easily seen that alus formalise word-schemas and ms-cs represents morphological correspondences between schemas, which inputs and outputs have to match in order to instantiate actual morphologically related word pairs.

17.3 Data and Linguistic Description

Our aim is to reuse the ms-cs approach just discussed, in order, this time, to formalise the so-called conversion morphological process in French. This section covers the arguments for choosing this particular word formation type, together with a brief summary of its main linguistic theoretical properties. Next we will see how a corpus-based experiment is used to match these theoretical properties against large-scale observed characteristics. Finally a set of the most frequent, productive, and stable linguistic properties of noun to verb and verb to noun conversions results from this comparison.

17.3.1 Issues with Conversion

The morphological conversion process produces an output lexical unit (the convert) from an input lexical unit belonging to a different syntactic category (the base), without any morphological mark. The only visible mark on the output belongs to the inflectional paradigm characterizing its category.Footnote 6 In French, verbs (V) may be converted from nouns (N) (balai N [broom N ]→conv balay(er) V Footnote 7 [sweep V ]), or from adjectives (A) (vide A [empty A ]→conv vid(er) V [empty V ]); N may be converted from verbs (vol(er) V [fly V ]→conv vol N [flight N ]) or from adjectives (portable A [portable A ]→conv portable N [laptop N ]), the opposite rarely being true (orange N →conv orange A). Being unmarked,Footnote 8 this type of word formation raises the issue of the process orientation,Footnote 9 i.e. there is no formal way to decide which one of N or V is the conversion output in e.g. balai, balay(er), vol and vol(er). Within the chosen wf theoretical approach, answering the orientation question amounts to making semantically driven decisions. In other words, detecting e.g. the NtoV versus VtoN conversion orientation means classifying Noun/Verb (quasi)homograph pairs according to a semantic relation.

17.3.2 NtoV Versus VtoN

The choice of focusing on the Noun/Verb pairs has been motivated by the presence of a large amount of such pairs, and by the high interest they gather within the linguistic community. In fact, according to Corbin (2004), there are no cases of V →conv A in French. Moreover the N →conv A type is limited to the production of chromatic adjectives derived from nouns referring to fruits or flowers (rose, orange…), and to the production of behavior adjectives converted from nouns referring to stereotypical animals (bête [beast N ], cochon [pig N ] …). Conversely, both N →conv V and V →conv N have been observed, in French as in other languages, even though morphology researchers (at least, the authors whose results are briefly reported below) do not often agree as far as conversion orientation is concerned.

A second argument justifying our choice, directly related to the first one, is the semantic heterogeneity of verbs and nouns involved in conversion processes. Regardless, for the time being, of their possible role of input or output in the conversion process, let us notice that nouns may denote concrete (sucre N /sucr(er)V [sugar N/V ]), animate (singe N /sing(er)V [monkey N /mimic V ]), human (guide N /guid(er)V [guide N/V ]), or abstract entities (nage N /nag(er)V [swimming N /swim V ]); that verbs may describe instrumental (hache N /hach(er)V [axe N /chop V]), dissociative (plume N /plum(er)V [feather N /pluck V ]), or locative (coffre N /coffr(er)V [chest N /throw inside V ]) processes; and that they may belong to all kinds of eventualities: activities (crayon N /crayonn(er)V [pencil N /scribble V ]), transitions (transport N /transport(er)V [transport N /carry V ]), etc.

Last but not least, our interest in Noun/Verb pairs is related to the fact that their linguistic description bridges word formation and lexical semantics. Given that the Noun/Verb orientation is exclusively a matter of semantics, deciding for N→conv V or for V→conv N amounts to detecting the semantic properties of V and/or of N. The purpose is (1) to check which of V or N is obtained from the other one, and, consequently, (2) to determine the semantic relationship holding between N and V. This second point amounts to deriving the definition of the output word from the input meaning. From these results, a (first attempt at a) semantics-based typology of NtoV and VtoN conversion should emerge, as we shall see below.

17.3.3 Theoretical Assumptions

Apart from attempts to tell NtoV from VtoN conversion according to phonological marks (see e.g. Katamba 1993 Footnote 10), the literature on Noun/Verb conversion tries to give semantic motivations for its classification proposals. For Aronoff et al. (1984), the orientation has to do with the thematic roles attached to V, and with which role N may or may not play. For Mel’cuk (1996, 1997), some VtoN conversions are what he calls empty categorial conversions, which may occur between an input lexical unit and an output lexical unit with stronger distributional constraints than those of the input. On the other hand, non-empty categorial conversions are generally oriented according to the semantic inclusion relation between the involved lexical units X and Y: if the meaning of X is included in that of Y, then X→Y. Moreover, he proposes, following (Corbin 1987), a paradigmatic orientation Footnote 11 of Noun/Verb conversion: it is oriented in the same way as affixation with the same semantic relation. For instance, since –eur in French basically builds agents (nag(er) V →-eur nageur N [swim V /swimmer N ]), and no other affix involves agents, all Noun/Verb pairs exhibiting an “agent” semantic relation should belong to that paradigm, and thus, for instance, guid(er) V →conv guide N.

Among the assumptions briefly reported above, the paradigmatic orientation hypothesis seems to be the most promising: in fact, Aronoff’s reliance on thematic roles would require a clear, stable and homogeneous definition of them, which is unfortunately not available. As for Mel’cuk, he formally defines neither distributional constraints (which govern empty VtoN categorial conversion) nor semantic inclusion (which governs non-empty NtoV conversion).

However, the paradigmatic orientation hypothesis is not a completely satisfactory solution. First, it does not account for pairs such as babouin N/babouiner V [baboon N /act as a baboon V ], in which imitation verbs depict the referent of the agent as acting in the same way as the referent of the base noun they are morphologically constructed from. Second, it leads to contradictory situations, e.g. when nouns refer to instruments. According to the paradigm, the conversion relation of Noun/Verb pairs should be V →conv N oriented when N denotes an instrument, since the only affixation processes dealing with instruments in French are the suffixations in –oir and –eur, which both form deverbal nouns. Therefore, for –oir, we have for instance hach(er) V →-oir hachoir N [chop V /chopper N ]. But for the same input, we notice that we also have hach(er) V →conv hache N [axe N ], bearing (apparently) the same semantic relation. This is also the case with other N/N-oir or N/N-eur pairs: drain N /draineur N [drain N /drainer N ], gril N /grilloir N [grill N /griller N ]. The meaning variation between the compared nouns may indicate that Noun/Verb and N-oir/V or N-eur/V do not belong exactly to the same paradigm. A clear example of this is the case of the verb agraf(er)V [staple V]: the noun agrafe N [staple N] refers to the concrete entity that performs the process itself, and the noun agrafeuse N Footnote 12 [stapler N ] to the instrument which must be used so that these staples can do their job. If the instrumental paradigm cannot always be clearly stated, then there is no longer much evidence for the V →conv N orientation when N is an instrument. Furthermore, D. Corbin partially reconsiders the overall paradigmatic hypothesis in later papers (Corbin 1997, 2004), mentioning instruments and instrumental verbs (scie N /sci(er)V [saw N/V ]) as NtoV conversion cases.

Be that as it may, we shall keep this paradigmatic assumption as a starting point. In addition to the NtoV wf processes, this hypothesis has also been the theoretical foundation for VtoN descriptions and analyses. One of the main contributors to these studies for French is F. Kerleroux: a very detailed analysis of converted deverbal nouns’ properties has been carried out by Kerleroux (1996a, b, 1997, 1999). Furthermore, Kerleroux (2004), Fradin and Kerleroux (2003a, b) redefine the notion of a lexeme. Consequently, they draw a set of conditions constraining VtoN conversion, using the differences these authors record between conversion and apocope, from both distributional and semantic points of view. VtoN conversion is also the object of study in Meinschäfer (2003); J. Meinschäfer proposes a set of criteria predicting the deverbal noun argument structure, with respect to that of the input verb. More precisely, she shows that deverbal nouns share with their input verb their aspectual and argumental properties, provided that the verb is not a causative event: Max recule la chaise [Max moves back the chair] → *Le recul de la chaise (par Max) [(Max’s) moving back of the chair]. However, she observes that causative and non-causative readings always alternate, thus allowing conversion: Max recule [Max is going back]→ le recul de Max [Max’s backward movement].

To sum up Noun/Verb conversion orientation possibilities, we can make the following assumptions:

  • N →conv V holds when (a) N is itself morphologically constructed (hachure [hatching N ]), (b) N is an instrument/substance used to perform the process described by V (scie [saw N ], sucre [sugar N ]), (c) N is the place where the process is performed (coffre [chest N ]), (d) N is the stereotypical agent of the process (singe [monkey N ]);

  • V →conv N holds, basically, when N is abstract; so N describes the process, its result or its product (transport); besides, N may denote the process agent (guide).
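As a reading aid, the two bullets above can be restated as a small decision function. This is a hedged sketch only: the feature names (`constructed`, `sem`) and the label inventory are hypothetical encodings of the criteria listed above, not part of the authors' formalism, and the function simply returns no decision when none of the criteria applies.

```python
# Semantic labels for N that, per the assumptions above, point to N -> V.
NTOV_LABELS = {"instrument", "substance", "place", "stereotypical_agent"}
# Semantic labels for N that point to V -> N.
VTON_LABELS = {"process", "result", "product", "process_agent"}


def orientation(noun):
    """Return 'NtoV', 'VtoN', or None when no criterion applies.

    `noun` is a dict with two hypothetical features:
      constructed: True if N is itself morphologically constructed
      sem:         a coarse semantic label for what N denotes
    """
    if noun.get("constructed"):
        return "NtoV"  # criterion (a): N is itself constructed
    if noun.get("sem") in NTOV_LABELS:
        return "NtoV"  # criteria (b), (c), (d)
    if noun.get("sem") in VTON_LABELS:
        return "VtoN"  # abstract N, or N denoting the process agent
    return None        # undecided: a new class may be needed


examples = {
    "hachure": {"constructed": True, "sem": "result"},
    "scie": {"sem": "instrument"},
    "coffre": {"sem": "place"},
    "singe": {"sem": "stereotypical_agent"},
    "transport": {"sem": "process"},
    "guide": {"sem": "process_agent"},
}

for lemma, features in examples.items():
    print(lemma, orientation(features))
```

The distinction between `stereotypical_agent` (singe) and `process_agent` (guide) carries most of the weight here, which is precisely where a manual, semantically informed annotation is required.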

With this first classification in mind, let us now turn to the corpus analysis. We have collected a set of Noun/Verb homograph pairs, in order (a) to try to classify them according to the above criteria and, (b) where (a) is not possible, to define new classes.

17.3.4 Corpus

To check the validity of the linguistic hypotheses made above, we have collected the set of quasi-homograph verbal and nominal lexical units from a large-scale machine readable dictionary, mainly the TLFnomeFootnote 13 word list. Lexical units are labeled with the appropriate part-of-speech, and differ at most in their endings and in allomorphic variants (e.g., a change in thematic vowel aperture, graphically marked by a diacritic, as with /ə/→/ε/ in relev(er) V /relève N [pick up V /relief N ], or by a doubled consonant, as with /ɔ̃/→/ɔn/ in collision N /collisionn(er)V [collision N /collide V ]).Footnote 14 A set of 2,500 Noun/Verb pairs has thus been gathered, half of which have been manually verified. The verification objectives are the following:

  • checking whether the paired elements are actually linked by conversion;

  • deciding on the conversion orientation, according to: (1) the theoretical assumptions given in Sect. 17.3.3 and (2) definitions and etymologies provided within dictionaries;

  • if needed, proposing new classes, or constraining the existing ones.

The conclusions of this large-scale verification are summed up in Sect. 17.3.5, from both a qualitative (ranking Noun/Verb pairs with respect to semantic classes) and quantitative (classifying Noun/Verb pairs according to their frequency) point of view.
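The collection step described in this section can be approximated as follows. The sketch below is not the procedure actually applied to the TLFnome word list; it only assumes that quasi-homographs may differ by the verbal ending -er, a final mute -e on the noun, a diacritic, or a doubled stem-final consonant, as in the examples given above. The function names and the tiny word lists are hypothetical.

```python
import unicodedata
from collections import defaultdict


def strip_accents(word):
    """Remove diacritics (relève -> releve) via NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def undouble(stem):
    """Collapse a doubled final consonant (collisionn -> collision)."""
    if len(stem) >= 2 and stem[-1] == stem[-2]:
        return stem[:-1]
    return stem


def noun_key(noun):
    key = strip_accents(noun.lower())
    if key.endswith("e"):
        key = key[:-1]  # drop a final mute e
    return undouble(key)


def verb_key(verb):
    key = strip_accents(verb.lower())
    if key.endswith("er"):
        key = key[:-2]  # drop the first-group infinitive ending
    return undouble(key)


def quasi_homograph_pairs(nouns, verbs):
    """Pair each verb with every noun that shares its normalised stem."""
    by_key = defaultdict(list)
    for noun in nouns:
        by_key[noun_key(noun)].append(noun)
    return [(noun, verb) for verb in verbs for noun in by_key[verb_key(verb)]]


nouns = ["relève", "collision", "sucre", "table"]
verbs = ["relever", "collisionner", "sucrer", "manger"]
print(quasi_homograph_pairs(nouns, verbs))
```

Such string matching deliberately overgenerates and misses other allomorphies (e.g. balai/balayer), which is why the manual verification step listed above remains necessary.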

17.3.5 Synthesis

Tables 17.1 and 17.2 below summarize the observations resulting from corpus data analysis. First, as far as VtoN conversion conditions are concerned (Table 17.2), the results are all in all in conformity with the hypotheses made in the previous section. Nouns massively refer to abstract entities (class -2-), although semantic derivations are sometimes observed: for instance, the process noun applique ([application N ]) converted from appliquer ([apply V ]) has a specialised meaning which leads this noun to refer to (concrete) entities, “whose function is to be fixed/mounted/hung (onto the wall)” namely wall lamps.

Table 17.1 NtoV conversion classes
Table 17.2 VtoN conversion classes

Concerning NtoV (Table 17.1), there are discrepancies between theoretical assumptions and corpus analysis results, which makes the definition of new classes necessary. For instance, similarity verbs are not only met with respect to the agent (singer V), but also with respect to the theme, which is affected by the change-of-state transition process described by the verb: marbr(er)V [marble V ] (class -8-). The property acquired by the theme is a shape, a color, etc. described by the referent of the input noun. In addition to class -2-, grouping artefactual instruments/substance-based verbs, another set of rather similar verbs has been collected in the so-called class -2′-: the input noun, referring to a part of the body (cil N [eyelash N ]) or a human characteristic (raison N [reason N ]), is used as an instrument (cill(er) V [blink V ], raisonn(er)V [reason V ]). Finally, a ‘default’, heterogeneous class has been drawn, grouping together Noun/Verb pairs in which N may refer to speech acts (laïus N vs laïuss(er) V [long winded speech N /expatiate V ]), to noise or sounds (vacarme N vs vacarmer V [uproar N /make an uproar V ], clic N vs cliquer V [click N/V ]), to concrete action results (sieste N vs siester V [nap N /have a nap V ], balafre N vs balafrer V [cut, slash, gash N/V ]). This class, labeled with -7-, is defined by means of a shallow link: V = “do/say/have N”. Within this class are also listed N/V pairs where V denotes delocutive acts: chouchou N vs chouchouter V ([darling N /pet V ]), peste N vs pester V ([heavens! N /curse V ]). The ‘delocutive derivation’, originally introduced in Benveniste (1966), has been investigated in Cornulier (1976) and Anscombre (1979). Delocutive denominal verbs can be glossed by “To say « N » ”. More recently, a historical review of this notion has been given in Larcher (2003).

Moreover, a productive class has been isolated, namely that of borrowings (crash(er) V /crash N) and onomatopoeias (blablat(er) V /blabla N [waffle on V /waffle N ]). As nouns belonging to these N/V pairs denote concrete entities (sounds and (speech) acts), they have been included in class -7-.

In addition to both the initial linguistic assumptions and the newly discovered classes, Tables 17.1 and 17.2 also include new columns with quantitative results obtained from the dictionary corpus analysis, and new cells corresponding to the newly discovered semantic classes just described.

Whereas VtoN conversion appears to be a stable wf process, leading to the formation of almost only abstract nouns, characterising NtoV types is a much less straightforward task. In fact, for this purpose, we have examined input N (formal, semantic, etymological) features only. To refine this classification, a next step will be to compare these criteria with output verb properties.

According to these (though perfectible) results, we can model the most frequent and seemingly productive conversion classes. With this choice, classes -2′-, -5- and -6- in Table 17.1, together with class -1- in Table 17.2, are excluded. Furthermore, we have chosen to disregard heterogeneous cases (i.e. classes -7- and -8- in Table 17.1) for the time being, since the linguistic content of this set of nouns and verbs has to be further examined; in particular, in Sect. 17.4.1.3, we come back to the reasons why Noun/Verb pairs which are members of class -8- are not accounted for in this chapter. Finally, the last excluded class is class -1-, Table 17.1, since NtoV orientation is in this group purely structure-driven. These decisions amount to designing two alus, the former constraining and producing denominal converted verbs, the latter defining the basic structure of deverbal converted nouns. In Sect. 17.4, we shall see which of the input properties can be encoded within alus, which ones fall within the competence of the actual input, and how the ms-cs mechanism is able to build the right output representations, whatever the requested Noun/Verb class.

17.4 Modeling

As announced in Sect. 17.2, the formal representation we wish to obtain combines the following requirements: (1) ms-cs is taken as an input to output unification mechanism; (2) a unique NtoV unified [X N ] V alu records the linguistic constraints common to classes -2-, -3- and -4- in Table 17.1, while a unique VtoN [X V ] N alu does the same for the representation of class -2- in Table 17.2 (see Sect. 17.3.5). Behind the idea of accounting for regular, productive and frequently represented conversion classes, the goal is to predict the characteristics of the neologisms most likely to be produced by Noun/Verb conversion.

17.4.1 Noun-to-Verb

Examining Table 17.1, Sect. 17.3.5, and excluding class -8-, three NtoV classes are very productive: class -2- (V = “do something using N”), class -3- (V = “(do what N would do|behave as N)”) and class -4- (V = “do or put something (with)in/during N”). As we shall see in Sect. 17.4.1.1, all output verbs are based on a unique alu called [X N ] V . Section 17.4.1.2 focuses on some examples for each of the classes which have been taken into account.

17.4.1.1 [X N ] V Abstract Lexical Unit for Noun-to-Verb Conversion

The following alu in (Fig. 17.2) accounts for the way output verbs inherit properties from the appropriate input nouns:

Fig. 17.2 [XN]V ALU

  • They inherit relevant argumental properties from their input noun, namely only those parameters which are used in input noun qualia roles \( \boxed2 \) and which are inherited by the verb. These parameters are encoded by \( \boxed{{{\text{a}}_{\text{i}}}} \) variables;

  • They inherit relevant aspectual and event structure parameters from their input noun, namely only those parameters which are used in the input noun qualia roles \( \boxed2 \) and which are inherited by the verb. These parameters are encoded by \( \boxed{{{\text{e}}_{\text{j}}}} \) variables;

  • They inherit only a part of the semantic content of their input noun, represented here by a part of the noun qualia. Mutual disjunctions (⊕) rule out overlapping between classes which have been accounted for:

    • if the input noun denotes an artefact (class -2-) or a location (class -4-), then the qualia of the output verb consists only in the telic value of the input qualia \( \boxed4 \),

    • if the input noun denotes a natural entity (class -3-), then the output verb inherits only the agentive value in the formal quale qs|form|ag of the input \( \boxed3 \), and as a consequence, the qualia of the output verb consists in this case in a formal role whose value is the conjunction of the predicate to_act_as_N and the qs|form|ag value, if any.
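The mutually exclusive branches of the [X N ] V alu can be illustrated with a short sketch. It does not reproduce the feature structure of Fig. 17.2: the dictionary layout, the `type` labels, the function name `ntov_qualia` and the string encodings of the predicates are assumptions made for the example only.

```python
def ntov_qualia(noun):
    """Select the part of the input noun's qualia inherited by the verb.

    The branches are mutually exclusive, mirroring the disjunction between
    the classes accounted for: artefact/location nouns project their telic
    value, natural-entity nouns project a formal role.
    """
    qualia = noun["qs"]
    if noun["type"] in ("artefact", "location"):   # classes -2- and -4-
        return {"telic": qualia["telic"]}
    if noun["type"] == "natural":                  # class -3-
        conjuncts = ["to_act_as(" + noun["lemma"] + ")"]
        agentive = qualia.get("formal", {}).get("ag")
        if agentive is not None:                   # qs|form|ag value, if any
            conjuncts.append(agentive)
        return {"formal": conjuncts}
    return None                                    # not licensed by this ALU


# Hypothetical, heavily simplified lexical entries.
singe = {"lemma": "singe", "type": "natural",
         "qs": {"formal": {"ag": "imitate(x, y)"}}}
coffre = {"lemma": "coffre", "type": "location",
          "qs": {"telic": "lock_up(x, y, coffre)"}}
nage = {"lemma": "nage", "type": "abstract", "qs": {}}

print(ntov_qualia(singe))
print(ntov_qualia(coffre))
print(ntov_qualia(nage))
```

Representing the conjunction as a list keeps the "if any" case simple: a natural-entity noun without an agentive value still yields the bare `to_act_as` predicate.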

17.4.1.2 Some Examples

As we shall see with the examples below, [X N ] V combined with the appropriate input noun enables the representation, via ms-cs, of each sort of output verbs from the following NtoV conversion classes: class -3-, with imitation verbs like sing(er), class -2-, with instrumental verbs like drain(er), crayonn(er), dynamit(er) [dynamite V ] and class -4-, with locative verbs like usin(er) [manufacture V], coffr(er). Examples from each class are meant to illustrate various cases of inherited aspectual properties.

17.4.1.2.1 Imitation Verbs

As said before, class -3- imitation verbs are built from nouns which denote natural entities, e.g. singe. Let us see how the output verb sing(er) is produced from its nominal input singe (Fig. 17.3). First, we may notice that the input noun qualia structure indicates that singe is an animal \( \boxed5 \) bearing a prototypical behavior, i.e. imitating a model \( \boxed6 \). This behavior is propagated through \( \boxed6 \) and via [X N ] V onto the constructed verb formal role. Therefore, the ms-cs-driven combination between [X N ] V and the lexical properties of the input noun provides the output verb with a qualia structure that can be paraphrased by: X sing(er) Y = “X acts as a N_monkey AND monkeys imitate Y”.

Fig. 17.3 Conversion class -3-: singe N → conv sing(er)V

17.4.1.2.2 Instrumental Verbs

Prototypically, instrumental verbs are morphologically constructed from input nouns referring to artefacts. Let us consider for instance, the example of drain N /drain(er) V pair (Fig. 17.4). The output verb drain(er) inherits the telic role from its input noun drain, because of the artefactual nature of the noun referent. This telic value is a complex structure which is characterised by the qualia label transitition_lcp and which consequently contains the specific features for transitions. Since drain(er) describes an instrumental predicate, its meaning, carried through index \( \boxed4 \), can be expressed through the following gloss: X drain(er) Y = “X uses N_drain to extract Y from Z AND Y is extracted from Z”.

Fig. 17.4 Conversion class -2-: drain N →conv drain(er) V

In addition, the inheritance of argumental and aspectual properties follows the general principles of noun and verb descriptions in gl. Except for the denoted entity \( \boxed1 \), the input noun's argumental parameters are always encoded as default arguments (ms|as|d_argi), whereas they are inherited as true arguments in the output verb's argument structure (as|argi). In the same way, the default evenemential parameters in ms|es|d_ej are inherited as true parameters in the output es|ej.

Following the lexical shadowing principle, the argumental parameter \( \boxed1 \) which encodes the entity denoted by the input noun is displayed as a shadow argument (s_arg0) in the output verb as.
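The inheritance pattern just described (default arguments and events promoted to true ones, and the noun's own referent displayed as a shadow argument) can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions: the dictionary layout, the function name convert_noun_to_verb and the placeholder values X, Y, Z, "process" and "state" are hypothetical and do not reproduce the feature structures of the figures.

```python
def convert_noun_to_verb(noun):
    """Sketch of argument and event inheritance in NtoV conversion."""
    # The entity denoted by the noun surfaces as a shadow argument (s_arg0).
    verb_as = {"s_arg0": noun["as"]["arg0"]}
    # Default arguments (d_argi) are promoted to true arguments (argi).
    for name, value in noun["as"].items():
        if name.startswith("d_arg"):
            verb_as[name[len("d_"):]] = value
    # Default events (d_ej) are promoted to true events (ej).
    verb_es = {
        name[len("d_"):]: value
        for name, value in noun["es"].items()
        if name.startswith("d_e")
    }
    return {"as": verb_as, "es": verb_es}


# Toy entry for drain N; all values are placeholders chosen for illustration.
drain_noun = {
    "as": {"arg0": "drain", "d_arg1": "X", "d_arg2": "Y", "d_arg3": "Z"},
    "es": {"d_e1": "process", "d_e2": "state"},
}
drain_verb = convert_noun_to_verb(drain_noun)
print(drain_verb["as"])
print(drain_verb["es"])
```

The sketch deliberately ignores qualia inheritance and coindexation; it only shows how the status of each parameter changes between input noun and output verb.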

The same mechanism is at play with the verbs crayonn(er) and dynamit(er), except that here the telic role of the input noun crayon (resp. dynamite) encodes an activity (resp. an achievement). This telic value is reflected within the inherited qualia structure of each corresponding output verb: crayonn(er) denotes an activity whereas dynamit(er) describes an achievement.

17.4.1.2.3 Locative Verbs

As shown in Fig. 17.5, the fact that the input noun coffre refers to a place \( \boxed5 \) leads to the morphological formation of the locative verb coffr(er). The meaning of coffr(er) can be paraphrased by X coffr(er) Y = “X locks up Y in N_chest AND Y is locked up in N_chest”. This verb inherits the relevant part from the input noun qualia (i.e. \( \boxed4 \), its telic value), and those appropriate argumental and evenemential parameters which are linked within this inherited qualia part (the state \( \boxed9 \) and the process \( \boxed{10} \); the agent \( \boxed8 \), the patient \( \boxed7 \) and the location \( \boxed5 \)). The input noun telic value being of type transition_lcp, this label is propagated in order to characterize the output verb qualia structure.

Fig. 17.5 Conversion class -4-: coffre N →conv coffr(er) V

The same mechanism is at play with the Noun/Verb pair usine N /usin(er) V, except for the kind of event denoted in the telic value of the input noun usine. This value is of type activity_lcp, and it also characterises the qualia structure of the output verb.

17.4.1.3 Conclusion

This section was devoted to the NtoV conversion wf process. We have seen that a single alu called [X N ] V, combined with the appropriate input noun through the ms-cs schema, is sufficient to build well-formed output verb meanings in a systematic way with respect to the ontological type of input nouns. Three kinds of output verbs are built in this way: imitation verbs like sing(er) from input nouns which denote natural entities; instrumental verbs like drain(er) from input nouns which denote artefacts; locative verbs like coffr(er) from input nouns denoting places or time intervals. As said above, these three kinds of output verbs correspond to three classes of Noun/Verb pairs, respectively class -3- (V = “(do what N would do|behave as N)”), class -2- (V = “do something to N theme using N”) and class -4- (V = “do or put N theme (with)in/during N”).
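The correspondence between the ontological type of the input noun and the class of the output verb can be summarised as a simple lookup. The following Python fragment is a hedged sketch: the type names (natural_entity, artefact, place, time_interval), the table NTOV_CLASSES and the function classify_ntov are labels invented for this illustration, and the glosses are abbreviated from the ones given above.

```python
# Hypothetical lookup: ontological type of the input noun
# -> (class label, kind of output verb, abbreviated gloss).
NTOV_CLASSES = {
    "natural_entity": ("-3-", "imitation", "behave as N"),
    "artefact": ("-2-", "instrumental", "do something to N_theme using N"),
    "place": ("-4-", "locative", "do or put N_theme (with)in N"),
    "time_interval": ("-4-", "locative", "do or put N_theme during N"),
}


def classify_ntov(noun_type):
    """Return the (class, kind, gloss) triple, or None if the type is not modelled."""
    return NTOV_CLASSES.get(noun_type)


print(classify_ntov("natural_entity"))  # e.g. singe -> sing(er)
print(classify_ntov("artefact"))        # e.g. drain -> drain(er)
print(classify_ntov("place"))           # e.g. coffre -> coffr(er)
```

The point of the sketch is that the conversion rule itself is constant; only the type of the input noun selects the resulting verb class. Types that the model does not cover simply return None.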

Let us come back to the reasons why class -8- of Table 17.1 has not been taken into account here. Observing this class, we may notice that the output verb exerts a change of state on its theme either with respect to the input noun itself (marbre N [marble] →conv marbr(er) V X Theme : “X Theme looks like marble”), or with respect to its shape (ballon N [balloon] →conv ballonn(er) V X Theme : “X Theme is round as a balloon”), or with respect to one of its parts (guêpe N [wasp] →conv guêp(er) V X Theme : “X Theme has a wasp waist”), or with respect to its function (frégate N [frigate] →conv frégat(er) V X Theme : “X Theme is such that its speed is that of a frigate”). In other words, in Noun/Verb pairs belonging to this class, the very meaning of the change of state affecting the referent of the verb theme may be a function of one of the input noun qualia roles: e.g. const (guêp(er)) or telic (frégat(er)).

Given the evident complexity of these verbs, it seems clear that a more subtle and discriminating representation of the verbs of Table 17.1, class -8- would provide us with very interesting results, and therefore deserves further investigation. However, we cannot address this question at the present time because several questions remain unanswered, among which the following two:

  1. No discriminating properties can be exhibited to constrain the membership of a given Noun/Verb pair to Table 17.1, class -8-, because of the large range of input types: inputs may denote substances (marbre), artefacts (ballon), animals (guêpe), etc.;

  2. No discriminating features can be defined to constrain the inheritance of input properties: output verb meaning can be obtained either from that of the whole entity referred to by the input noun, or from that of a related entity: e.g. the shape, some part, the function, etc. of the entity denoted by the input noun.

Answering these crucial issues is a mandatory precursor to proposing a formal model that derives the semantic content of Table 17.1, class -8- output verbs from that of input nouns. A makeshift way to answer the first issue above would be to use some underspecified predicate such as V = “to_give_some_characteristics_of_N”, but such a controversial solution would not solve the second question. As a consequence, we prefer not to account for Noun/Verb pairs of Table 17.1, class -8- as long as points 1 and 2 remain unanswered.

17.4.2 Verb-to-Noun

As emerges from the section devoted to linguistic descriptions, and according to the quantitative corpus-based values reported in Table 17.2, Sect. 17.3.5, most deverbal converted nouns (i.e. those labelled by class -2-) describe either the verbal process or its result. Footnote 15 A third reading is a conceptual or propositional one. Footnote 16

These interpretations are all possible. Some nouns may realise all of them, for instance marche [march N /walk(ing) N ]: (processive) la marche durera environ une heure [the walk/march will last about one hour], (result) la marche des Américaines a été un succès [The American women’s march has been a success], (concept) la marche est une discipline olympique [walking is an Olympic sport]; some other nouns have only two interpretations. Thus chant [song N /singing N ] is either resultative, le chant des sirènes a ensorcelé Ulysse [The sirens’ song bewitched Ulysses], or conceptual, le chant est un art [singing is an art]. The ms-cs output lexical unit does not try to guess which of the readings is actually realised by the noun; it just provides nouns with the three possibilities.

17.4.2.1 V and N Minimal Required Features

Gathering the main properties accounted for by various authors (Corbin, Kerleroux, Fradin, Meinschäfer) and mentioned in Sects. 17.3.3, 17.3.4, and 17.3.5, we obtain the following list of minimal requirements the VtoN alu, noted [X V ] N , must satisfy in order to properly constrain the abstract semantic structure of deverbal converted nouns:

  1. Its ms – collecting the characteristics all input candidate verbs must share – is as follows:

    • causative readings of input verbs being excluded, the verb qualia label should exclude any potential causative interpretation;

    • the event structure should not be that of a simple state;

    • the argument structure is unconstrained: actually, candidate verbs may or may not be transitive;

  2. The description of the noun itself denotes an abstract entity with three possible readings: processive, resultative, or conceptual,

    • in its evenemential (processive or resultative) readings, the output noun inherits all the verbal aspectual and argumental properties, following (Meinschäfer 2003);

    • in its conceptual reading, the output noun refers a priori to a so-called proposition entity. So, this denotation, noted prop, must be part of the output qualia label.

Finally, in contrast to what happens for denominal verbs, described in Sect. 17.4.1, and to what Pustejovsky (1996) assumes, input verbs do not carry the output noun index as a shadow argument (s_arg). We have to remember that Pustejovsky (1996) proposes that the s_arg value be instantiated, for verbs such as dance V or butter V, by means of what could be considered as an incorporated noun. We agree with this assumption as far as butter is concerned, since the entity is a logical part of the predicate, but not for dance, at least in French. Actually, as for any input verb of a VtoN conversion process, allowing for an s_arg value in the as of dance would amount to allowing for a circular definition of V and N: the output noun would be, at the same time, both morphologically obtained from the verb and semantically integrated in the verb definition.

17.4.2.2 [X V ] N Abstract Lexical Unit

Figure 17.6 below formalizes the set of constraints just recalled in the [X V ] N alu.

Fig. 17.6 [X V ] N alu for VtoN conversion

For readability's sake, the input verb as (resp. es) is directly coindexed through \( \boxed2 \) (resp. through \( \boxed3 \)) with the alu as (resp. es). All the nominal arguments (except the as|arg0 value) are inherited as default arguments, and value sharing would have required a slightly more complex representation.

In the alu qualia (qs), a new parameter w0 is used in the form value to ensure the existence of a conceptual interpretation of the expected output noun. The ms qualia value, characterising the potential input verb, is shared with the output noun's qs|agent value through index \( \boxed7 \), provided that this shared value meets the type constraint imposed on the input verb. Recall (Sect. 17.3.3) that, following Meinschäfer (2003), this type constraint says that V should not have a causative reading, which is identified by the label \( \neg \)cause_lcp. The input verb type, represented by the lcp label and indexed with \( \boxed6 \), becomes one component of the output noun qualia label. Given that none of the three potential interpretations of the output noun can combine with another in any context, the lcp of this output noun is identified by an exocentric dotted type.Footnote 17 This complex type is composed of two simple types: the input verb type indexed with \( \boxed6 \) and the prop type. Therefore, the lcps of deverbal converted nouns are identified with: prop • \( \boxed6 \)_lcp, where \( \boxed6 \) stands for any aspectual type except cause. Furthermore, \( \boxed3 \) indicates that the type of the first es event, i.e. e1, cannot be a state. Still according to Meinschäfer (2003), cf. Sect. 17.3.3, this second constraint filters out stative verbs. In other words, deverbal converted nouns basically denote abstract entities, and their agentive role (in fact, their origin) is the meaning of the verb they are converted from.

As for type accommodation between input verbs and the [X V ] N alu constraints, according to the usually adopted type hierarchy given in Fig. 17.7, \( \neg \)cause is equivalent to all of the event subtypes except cause. Now, cause being an accomplishment subtype, a \( \neg \)cause_lcp marked verb may express any non-causative accomplishment, activity or achievement.

Fig. 17.7 Usual type hierarchy of eventualities
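To make the \( \neg \)cause constraint concrete, the type check can be sketched in Python as a walk up a small type hierarchy. The hierarchy encoded below is an assumption reconstructed from the running text only (cause as a subtype of accomplishment; state, activity, achievement and accomplishment as event subtypes); it is not a transcription of Fig. 17.7, and the names PARENT, is_subtype and satisfies_not_cause are hypothetical.

```python
# Assumed hierarchy (child -> parent), reconstructed from the text only.
PARENT = {
    "state": "event",
    "activity": "event",
    "achievement": "event",
    "accomplishment": "event",
    "cause": "accomplishment",
}


def is_subtype(event_type, ancestor):
    """True if event_type equals ancestor or lies below it in the hierarchy."""
    current = event_type
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


def satisfies_not_cause(event_type):
    """A type satisfies the constraint unless it is cause or a subtype of cause."""
    return not is_subtype(event_type, "cause")


for t in ("activity", "achievement", "accomplishment", "cause"):
    print(t, satisfies_not_cause(t))
```

Note that this check covers only the qualia label constraint; the exclusion of simple states is a separate constraint on the event structure and is not modelled here.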

In the case of a candidate input verb which includes a causal reading and is of an exocentric dotted type, filtering out the cause component of the verb by unification through \( \boxed6 \) amounts to keeping only its non-causative interpretation, by means of those qualia role predicates the remaining component type can access. Consequently, only those evenemential and argument variables used in the accessed predicates are kept in the respective structures.
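The filtering step described here can be sketched as follows in Python. The encoding is a deliberately simplified assumption: a dotted type is a list of component types, each qualia role records which component gives access to it and which variables it uses, and the toy entry for recul(er) uses invented variable names (x, y, e1, e2) and role descriptions that are not taken from the figures.

```python
def filter_cause(verb):
    """Drop the cause component of a dotted type, with its roles and variables."""
    kept_types = [t for t in verb["lcp"] if t != "cause"]
    # Keep only the qualia roles that a remaining component type can access.
    kept_roles = {
        role: pred
        for role, pred in verb["qs"].items()
        if pred["type"] in kept_types
    }
    # Collect the variables still used by the remaining predicates.
    used = set()
    for pred in kept_roles.values():
        used.update(pred["vars"])
    return {
        "lcp": kept_types,
        "qs": kept_roles,
        "as": {k: v for k, v in verb["as"].items() if k in used},
        "es": {k: v for k, v in verb["es"].items() if k in used},
    }


# Toy entry for recul(er): causative reading on the agentive role,
# resultative reading on the formal role (placeholder values throughout).
reculer = {
    "lcp": ["cause", "activity"],
    "qs": {
        "agentive": {"type": "cause", "vars": ["e1", "x", "y"]},
        "formal": {"type": "activity", "vars": ["e2", "y"]},
    },
    "as": {"x": "causer", "y": "moving entity"},
    "es": {"e1": "causing process", "e2": "movement process"},
}
filtered = filter_cause(reculer)
print(filtered["lcp"], sorted(filtered["as"]), sorted(filtered["es"]))
```

In this sketch the causer x and the causing event e1 disappear together with the agentive role, which mirrors the idea that only the variables used in the accessed predicates survive.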

The whole word formation mechanism, made up of the [X V ] N alu, the candidate input verb X1, the ms-cs system and the output noun [X1 V ] N, is given in Fig. 17.8.

Fig. 17.8 ms-cs with VtoN conversion

  1. the [X V ] N alu subsumes the common properties of all converted deverbal nouns, by defining the minimal requirements on the expected input verb;

  2. the potential input verb X1 has to unify with the alu ms, in order to activate ms-cs;

  3. the actual deverbal noun [X1 V ] N results from the ms-cs unification process, instantiating [X V ] N by means of appropriate X1 features. Examples of VtoN conversions involving non-causative verbs or non-causative verb readings, presented in Sect. 17.4.2.3, illustrate this mechanism.
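The three steps above can be sketched as a small Python function. This is a rough approximation under explicit assumptions: real unification over typed feature structures is replaced by two plain tests, verbs carry a single simple lcp label, and the field names, the function verb_to_noun and the toy entries (including the stative and causative ones, which stand for no particular French verb) are hypothetical.

```python
def verb_to_noun(verb):
    """Return the converted noun, or None if the verb fails the alu constraints."""
    # Step 2: the input verb must meet the ms constraints.
    if verb["lcp"] == "cause":
        return None  # violates the non-causative requirement on the qualia label
    if verb["es"].get("e1") == "state":
        return None  # the first event must not be a state
    # Step 3: instantiate the noun from the verb's features.
    return {
        "lcp": "prop • " + verb["lcp"] + "_lcp",
        # Verbal arguments are inherited as default arguments of the noun.
        "as": {"d_" + name: value for name, value in verb["as"].items()},
        "es": dict(verb["es"]),
        # Formal role: the prop-typed index w0; agentive role: the verb's qualia.
        "qs": {"formal": "w0", "agentive": verb["qs"]},
    }


# Toy entries with placeholder values.
marcher = {
    "lcp": "activity",
    "as": {"arg1": "y0"},
    "es": {"e1": "process"},
    "qs": {"formal": "walk(e1, y0)"},
}
stative_verb = {"lcp": "state", "as": {"arg1": "y0"}, "es": {"e1": "state"}, "qs": {}}
causative_verb = {"lcp": "cause", "as": {"arg1": "x0"}, "es": {"e1": "process"}, "qs": {}}

marche = verb_to_noun(marcher)
print(marche["lcp"])
print(verb_to_noun(stative_verb), verb_to_noun(causative_verb))
```

Verbs of an exocentric dotted type with a cause component are not handled by this sketch; they would first need their causative reading filtered out.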

17.4.2.3 Examples

In the following section we illustrate the various verbal lexical types which can unify with ms in [X V ] N alu. The section starts with non causative verb types, i.e. the case of an activity input verb (march(er) [walk V ]), followed by the case of a non-causative accomplishment (transport(er) [carry V ]). Then, the last two examples are meant to indicate how the mechanism works in order to deal with input verbs which bear a causative interpretation (recul(er), angoiss(er) [distress V ]).

17.4.2.3.1 Activities

In simple activity (non-telic) qualia structures, such as in march(er) (or dans(er) [dance V ] or chant(er) [sing V ] …), only the formal role is defined, as shown on the left-hand side of Fig. 17.9. The successful ms-cs output deverbal noun appears on its right-hand side. As indicated in Fig. 17.8, both the input argument structure (ms|as) and event structure (ms|es) are inherited by the output noun, through, respectively, indices \( \boxed2 \) and \( \boxed3 \). As for the input qualia structure (index \( \boxed7 \) in Fig. 17.8, and \( \boxed4 \) in Fig. 17.9), it matches against the noun agentive qualia value, according to the [X V ] N recommendations. [X V ] N also imposes on the output noun an exocentric dotted type labelled with activity • proposition_lcp. This type results (1) from the successful unification of the march(er) lexical entry with the [X V ] N ms value (Fig. 17.8, index \( \boxed1 \)), which means: (a) event type compatibility (march(er) does not designate a state), and (b) lcp compatibility (activity is a case of \( \neg \)cause); and (2) from the successful propagation of the verb lcp into the nominal lcp labelling (Fig. 17.8, index \( \boxed6 \)). The ms value \( \boxed1 \) is propagated onto the output noun structure, though this is not represented in Fig. 17.9. As shown by its qs, the exocentric dotted typed noun marche holds two readings: the first refers to a process, la marche des soldats sur la ville [The soldiers’ march on the city], or to its result, trois longues marches en forêt [three long walks in the forest] (depending on the realisation of the agent y0), and the second to a concept, la marche est une discipline olympique [race walking is an Olympic sport], activating only the qs|form value.

Fig. 17.9 VtoN conversion: march(er) V →conv marche N

17.4.2.3.2 Accomplishments

With transport(er), we intend to illustrate (Fig. 17.10) a case of non-causative accomplishment. The input verb’s event structure, headed by the process, is propagated into the output noun \( \boxed3 \), together with its argument structure \( \boxed2 \). The ms-cs unification principle works in the same way as for marche, and gives rise to the definition of an exocentric dotted typed qualia structure composed of two mutually exclusive types: prop, as in un transport nécessite toujours un transporteur [transports always require conveyors], and accomplishment, with type activity • state. Accomplishments can be realised, as nominal lexical units, either through the qs|ag|ag activity value: le transport, lundi prochain, de la marchandise par le premier convoi [Goods conveying, next Monday, by the first train], or through the qs|ag|form resultative (state) value: Tous les transports sont annulés jusqu’à lundi prochain Footnote 18 [all transports are cancelled until next Monday].

Fig. 17.10 VtoN conversion: transport(er) V →conv transport N

17.4.2.3.3 Causatives

Let us now turn to seemingly more complex verb types. The verbs recul(er) and angoiss(er) illustrate the case of causative predicates, which are mainly movement or psychological predicates. In Max recule la chaise [Max moves back the chair], the agent Max voluntarily causes the chair's movement, and in Le film a angoissé Max [The movie distressed Max], the movie content brings about Max's psychological state of anxiety. These verbs generally carry a second, resulting and intransitive reading: the subsequent movement for verb types like recul(er), Les ennemis reculent [Enemies are going back], and the caused state for verb types like angoiss(er), Max angoisse [Max is worried sick]. As J. Meinschäfer pointed out, only the non-causative reading is an available candidate input for VtoN conversion: *le recul de la chaise par Max [Max’s moving back of the chair], versus le recul de Max [Max’s backward movement], *l’angoisse de Max par le film [The movie distress of Max], versus l’angoisse de Max [Max’s distress].

From a formal point of view, these distinct and non-overlapping verb interpretations are represented by exocentric dotted typed structures. As illustrated by recul(er) (Fig. 17.11) and by angoiss(er) (Fig. 17.12), the activation of causative readings (agentive role) and that of resultative readings (formal role) are therefore mutually exclusive.

Fig. 17.11 Lexical entry of recul(er) V

Fig. 17.12 Lexical entry of angoiss(er) V

Unifying recul(er), as illustrated above, with the [X V ] N alu ms (see Fig. 17.6) is above all, in this case, a matter of qualia type unification. In fact, the cause component within the cause • activity_lcp labelled verbal exocentric dotted type is neutralised through unification with the \( \neg \)cause_lcp label required of the input verb. As this component is filtered out, so are the corresponding qualia roles, together with their event and argument parameters. The effect of unification is to select only the resultative reading of the verb. The same occurs for angoiss(er): through unification with the alu ms, the causative interpretation is rejected, whereas the resultative static predicate is kept as the actual VtoN conversion input.

Once the correct reading has been selected, the rest of the wf mechanism works in a straightforward way: (1) the appropriate qualia label fills the missing slot in the output noun lcp (thus providing recul with proposition • activity_lcp, and angoisse with proposition • state_lcp), (2) the qualia structure defining the resultative predicate is inherited by the noun agentive value, while its formal value is the prop typed index w0. Output nouns recul [backward movement N /retreat N ] and angoisse [distress N ] are displayed respectively in Figs. 17.13 and 17.14. It can be noticed that (1) recul may denote an agentive intransitive movement process (le recul de l’armée [The army’s retreat]), or the movement result (les reculs sont inévitables [Backward movements are unavoidable]), or a concept (le recul s’oppose à l’avancée [Backward movements are opposed to advancements]); (2) similarly, angoisse is a static nominal, that may or may not affect an experiencer (l’angoisse (de Max) a été provoquée par un stress [(Max’s) distress has been caused by stress]) or depict a concept (l’angoisse est étudiée en psychanalyse [Anguish is studied in psychoanalysis]).

Fig. 17.13 VtoN conversion output recul N

Fig. 17.14 VtoN conversion output angoisse N

17.5 Conclusion and Perspectives

In this paper, we have described a gl-based model designed for Word Formation. This model includes a composition schema called ms-cs and several abstract lexical units, each of which simulates a Word Formation process. This device was first used to represent noun-to-verb é-prefixation in French (Jacquey and Namer to appear; Namer and Jacquey 2003). The robustness of the chosen approach has been confirmed by its application to the conversion framework presented here. The success of this approach is due to the fact that it combines linguistic hypotheses from a well-established morphological theory (inspired by D. Corbin's work) with a lexical semantics formalism, namely gl (Generative Lexicon).

Moreover, coupling Word Formation theory with lexical semantics, through this method, has two additional effects:

  • It makes obvious differences between seemingly identical phenomena. This paper has illustrated the structure distinctions for verbs such as walk and dance, or drain and butter, whereas they were analyzed in the same way in Pustejovsky (1996),Footnote 19 although they belong to opposite Word Formation families according to the morphological theory we rely on.

  • It draws out similarities concealed behind apparent differences. Hence this paper has shown that verbs drain(er) and sing(er) result from a single WF rule, via [X N ] V alu. Each time, only one mechanism is at play, their corresponding input nouns being responsible for the differences in verbal meanings.

Both similarities and differences are detected and analyzed within morphological theory; gl collects, ranks and formally expresses all of these linguistic hypotheses. In addition to this new collaboration between the two linguistic fields, the model also seems to open up new prospects in Natural Language Processing. In fact, formalizing both NtoV and VtoN conversion on the basis of a corpus analysis can be viewed as an empirical check of linguistic predictions about neologisms. In this regard, this experiment has confirmed the productivity of verb-to-noun (VtoN) conversion leading to processive nouns, and that of the noun-to-verb (NtoV) conversion process leading to instrumental, locative or stereotypical agentive verbs. On the other hand, it has also allowed us to detect the emergence of new, quantitatively important classes: denominal verbs denoting a change of state (marbr(er), guêp(er)), and Noun/Verb pairs built on borrowings and onomatopoeias that denote (speech) acts or sounds (patch/patch(er), blabla/blablat(er), glouglou/glouglout(er)).

Identifying the most creative conversion types, predicting the semantic constraints they exert on both input and output, and drawing their input-to-output semantic relationships through the choice of the right conversion orientation are results which could be used further in nlp systems in order to enrich lexical contents.