
1 Introduction

A type is a set of values. When we write a syntactic type, say NP, we mean a set of expressions (values) which can substitute for that type. This type serves to distinguish some expressions from, for example, the set of expressions that can substitute for a VP type.

The distinction is crucial for solving the correspondence problem in syntax-semantics. For this purpose we talk about semantic types, for example e for things and t for propositions. The concepts that can substitute for semantic types are not expressions in the sense that syntactic expressions are, because they are not observable; rather, we rely on a theory to hypothesize about the kind of semantic values these types stand for.

These two species of types are then put in correspondence in a theory of the syntax-semantics connection. The understanding is that if one substitutes a certain expression for a syntactic type, then a certain kind of semantic value substitutes for its corresponding semantic type. We know less about the semantic values; but, at the level of the correspondence problem, this is not very critical. It is however crucial to make the distinctions and propagate them in a parsing mechanism, rather than solving all type-interpretation problems in one go.

We need a theory which provides an explicit vocabulary and mechanism for the correspondence, and which can be specific about the equal relevance of substitution for subexpressions which purportedly do not contribute to the meaning of the expression.

In the categorial grammar parlance, for which we will use Combinatory Categorial Grammar [29, 30], hereafter CCG, we can exemplify the correspondence as follows, where we use the “result-first argument-next” notation:

figure a

Some syntactic types are further narrowed down by features, such as the third-person-singular NP above; such features are, in CCG, not re-entrant.
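To make the notation concrete, the following is a minimal Python sketch of categories in the "result-first argument-next" notation, with a feature-narrowed NP. The particular entry (a transitive hits with a third-person-singular subject) and its LF are illustrative assumptions about the kind of correspondence in (1), not its exact content.

```python
# Minimal sketch of CCG categories: atoms may carry features (e.g. 3s on NP);
# functors pair a result with an argument under a directional slash.

from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Atom:
    name: str                       # e.g. 'NP', 'S'
    features: Tuple[str, ...] = ()  # e.g. ('3s',)

@dataclass(frozen=True)
class Functor:
    result: object                  # result-first ...
    slash: str                      # '/' (argument to the right) or '\\' (to the left)
    argument: object                # ... argument-next

NP   = Atom('NP')
NP3s = Atom('NP', ('3s',))
S    = Atom('S')

# hits := (S\NP_3s)/NP : \x \y. hit' x y   -- an assumed entry; LF as a curried lambda
hits_cat = Functor(Functor(S, '\\', NP3s), '/', NP)
hits_lf  = lambda x: lambda y: ('hit', x, y)

print(hits_cat)
```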

We argue in the paper that in a radically lexicalized theory which adheres to transparency of derivations by type substitution (rather than lexical insertion), such as CCG, there are built-in degrees of freedom to support Multi-word Expressions (MWEs) and idioms without complicating the mechanism.

Paracompositionality is key to the projection of their properties in a derivation. The lexical correspondence is compositional partly because it relies on non-vacuous abstractions. The idea is that type substitution by (i) what we call singleton types and (ii) what is called head-dependencies in the NLP literature is also compositional, because it spells out non-vacuous abstraction as part of the correspondence, but as something related to the contingency of the predicate rather than to its argument structure. In a radically lexicalized grammar both sources are available in a lexical item. These types are paracompositional also in the sense that whether we have an idiomatic reading or a compositional one is already decided by the category of the head in the derivational process.

The term contingency is used here in the sense of Moens and Steedman [23] where it relates to extension of happenings. In the case of events (culminations, points, processes and culminating processes), which have definite extension, it is an event modality of space, time and manner; and, in the case of states where extension is indefinite (e.g. understand) it is some property of the state. From now on when we use the term ‘contingency’ we mean something related to extension of the predicate, rather than who does what to whom in the predicate.

MWEs are expressions involving more than one word in which the properties of the expression are not determined by the composition of the properties of the constituent words, as would be the case for phrases. There is a tendency to treat them as single lexical units [10, 33]; but, as we shall see, CCG does not require the single unit to be the phonological representation to the left of ‘:=’ in the format of (1). This property of CCG naturally extends to coverage of verb-particle constructions, e.g. look the word up, as discontiguous MWEs headed by a lexical item.

Phrasal idioms and idiomatically combining phrases are classes identified by Nunberg, Sag and Wasow [24] to account for systematic variation in syntactic productivity of idioms. Typewise they will relate to singleton types (phrasal idioms) and head-word subcategorization (idiomatically combining phrases) in our formulation.

As a preview of the article, we can think of the meaning distinctions as ranging from “beans”, i.e. the noun phrase beans itself as a category (this is what we call the singleton type); to \(NP_{\mathrm{beans}}\), the category of an NP headed by the word beans, which has a wider range of substitution; and to the polyvalent NP, with the widest substitution for that type. This much is categorial grammar with type substitution. CCG as an empirical theory adds to this the claim that there is an asymmetry in the range of substitutions: singleton types can be arguments only, and arguments of arguments and results, but never the result. We shall see that this has implications for the linguist’s choice of handling syntactic productivity in a grammar.

Some implications follow: Because of paracompositionality, all expressions requiring a singleton type would involve the semantic type of a predicate, and all idiomatically combining phrases requiring a different interpretation from the compositional one would have the same consequence, independent of their syntactic productivity. In short, every idiom must contain a predicate (but not necessarily a verb). We cover these implications in the article.

2 Substitution in a Derivation

In (1a), the NP can be substituted for by certain kinds of expressions, for example John, me, the ball, a stone in the corner, etc. Its corresponding semantic counterpart in the logical form (LF), written after the colon, has the placeholder x, which can be typed as e, to be suitably substituted for by a semantic value of the kind described above. The \(NP_{3s}\) in (1b) can be substituted for by narrower expressions, for example eliminating I and you. Because this is an indirect correspondence, its semantic counterpart y can have the same type e.

The tacit assumption of indirectness is sometimes made explicit, for example in Bach’s [2] rule-by-rule hypothesis: The derivational process operates with syntactic types only, and when it applies the semantics of the rule, its semantics works only with LF objects. Quoting from Bach: “Neither type of rule has access to the representations of the other type except at the point where a translation rule corresponding to a given syntactic rule is applied.” The “syntactic rule” in a lexicalized grammar such as CCG is the combinatory syntactic type of a lexical correspondence. The “translation rule” is the lexically-specified logical form, LF, as in (1).

The derivational process reveals partially derived types, for example

figure f

for (1a), if function application substitutes, say, a stone for the NP, with some semantic value for it. The semantic type of such derived categories is concomitantly functional, e.g. \(e\mapsto t\) for this syntactic type. John hits is \(e\mapsto t\) too, with category \(S/NP\).

We can see the relevance of derived types to substitutability in a closer look at (1b). If function application substitutes an expression for the NP in (1b), the derived category would be

figure k

in this case. This is also an \(e\mapsto t\) type semantically. However, its syntax is narrower so that we can account for the expressions in (2).Footnote 1

figure u

The derivational process works as below, with the placeholder y distinct from x.

figure x

Here function application is shown in forward form (\(>\)) and backward form (\(<\)). Derivation proceeds from top to bottom in the display, as standard in CCG; i.e., bottom-up as far as parsing is concerned, and one step at a time. For brevity, alternative derivations using function composition are not shown; their implications for constituency are discussed in the Steedman references. We also eschew the slash modalities of Baldridge and Kruijff [3] to avoid digression; they can further restrict the combination possibilities of syntactic types, and are mentioned later when they are relevant to the discussion. The LF contains a structured form, viz. the predicate-argument structure, which is written in linear notation for simplicity; for example \(p\,a\,b\) is the same as \((p\,a)\,b\); i.e., it is left-associative.
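The following sketch spells out forward and backward application over (category, LF) pairs, with derivations proceeding one step at a time. The tuple encoding of categories and the toy lexicon for John, hits and a stone are illustrative assumptions in the spirit of the running example.

```python
# Sketch of forward (>) and backward (<) function application over
# (category, LF) pairs.  Categories are atoms (strings) or triples
# (result, slash, argument).

def fapply(left, right):
    """X/Y : f   Y : a   =>   X : f a   (forward application, >)"""
    (lcat, llf), (rcat, rlf) = left, right
    if isinstance(lcat, tuple) and lcat[1] == '/' and lcat[2] == rcat:
        return (lcat[0], llf(rlf))
    return None

def bapply(left, right):
    """Y : a   X\\Y : f   =>   X : f a   (backward application, <)"""
    (lcat, llf), (rcat, rlf) = left, right
    if isinstance(rcat, tuple) and rcat[1] == '\\' and rcat[2] == lcat:
        return (rcat[0], rlf(llf))
    return None

S, NP = 'S', 'NP'
hits    = (((S, '\\', NP), '/', NP), lambda x: lambda y: ('hit', x, y))
john    = (NP, 'john')
a_stone = (NP, 'a_stone')

hits_a_stone = fapply(hits, a_stone)   # derived category S\NP, an e |-> t type
print(hits_a_stone[0])                 # ('S', '\\', 'NP')
print(bapply(john, hits_a_stone))      # ('S', ('hit', 'a_stone', 'john'))
```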

In preparation for final discussion of substitution (§6) in relation to the wrapping operation, we can redraw this derivation by showing the substituting expressions as we proceed, which we do in Fig. 1.

Fig. 1.
figure 1

Substitution of syntactic expressions for syntactic types. Boxes show segments combined. We display some one-at-a-time derivations on the same line to save space.

MWEs present a challenge for substitution in such correspondences. In Schuler and Joshi’s [28]:25 words: “In the pick .. up example, there is no coherent meaning for \( Up \) such that \(\llbracket pick \ X \ Up \rrbracket = Pick (\llbracket X\rrbracket , Up )\).” They go on to show how tree-rewriting in the form of TAG transformations, rather than the string-rewriting of CFG transformations such as [27], can deliver different meanings of such expressions after a fully compositional tree is established for ‘pick’, ‘..’ and ‘up’.

In such systems, post-processing and reanalysis of a categorial surface derivation are possible, both for TAG and HPSG;Footnote 2 therefore such transformations are possible, indeed useful, for simplifying large-scale grammar development.

For radically lexicalized grammars such as CCG where such options are not available, three paths to maintaining compositionality in the presence of “non-compositional” and/or idiomatic parts seem to be available:

figure ac

The problem is exacerbated by phrasal idioms which seem to have partially active syntax in some non-compositional parts, for example kick the (proverbial/old) bucket, but note \(\sharp \)the bucket that John kicked, \(\sharp \)kick the great bucket in the sky, and *the breeze was shot. (\(\sharp \) is used to indicate unavailability of idiomatic reading. The last two examples and judgments are from [27].) However, there are also phrasal idioms which are syntactically quite active, e.g. the beans that John spilled, and spilling the musical/artistic/juicy beans.

Option (4a) does not always necessitate post-processing of MWEs in CCG, but, as we shall see later in (23), it does not guarantee locality of derivations either. One way to realize it is the following:

figure ad

This approach to phrasal idioms, which is similar to meaning postulates for the same task such as [25], would then have to make sure that the head meaning has some predefined cluster of modifiers such as proverbial or old, but not much else, for example \(\sharp \)kick the bucket that overflowed. It would also have to overextend itself to avoid the idiomatic reading in \(\sharp \)the bucket that you kicked.

As an alternative, the type below is inspired by trainable stochastic CFGs which can distinguish argument PPs from adjunct PPs by encoding head dependencies for CFG rules, for example \(\hbox {VP}_{\mathrm {put}} \rightarrow \) \(\hbox {V}_{\mathrm {put}}~\hbox {NP}~\hbox {PP}_{\mathrm {on}}\): (We shall fix the unaccounted vacuous abstraction in it later in the paper.)

figure ag

It might appear to be LF-motivated just like (5) above; but, it is actually a case of (4c/i). \(NP_{\mathrm{bucket}}\), meaning an NP headed by bucket, can be made distinct from the plain NP because different surface expressions can be substituted for them. (6) overgenerates for the examples given above, but it might be the right degree of freedom to exploit in the syntax-semantics correspondence of idiomatically combining MWEs, such as \(NP_{\mathrm{beans}}\) for spill the beans.
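A minimal sketch of what such head-word marking amounts to computationally: a head-marked type such as \(NP_{\mathrm{bucket}}\) accepts only phrases whose head word is bucket, while a plain NP accepts any noun phrase. The phrase records and the head-extraction convention are our own assumptions; the second test illustrates the overgeneration just noted.

```python
# Sketch of head-word marking on argument categories, in the spirit of
# head-dependencies in stochastic CFGs (VP_put -> V_put NP PP_on).

def matches(arg_type, phrase):
    """arg_type: ('NP', None) for plain NP, or ('NP', 'bucket') for NP_bucket."""
    cat, head = arg_type
    return phrase['cat'] == cat and (head is None or phrase['head'] == head)

the_bucket     = {'cat': 'NP', 'head': 'bucket', 'words': 'the bucket'}
bucket_rel     = {'cat': 'NP', 'head': 'bucket', 'words': 'the bucket that overflowed'}
a_stone        = {'cat': 'NP', 'head': 'stone',  'words': 'a stone'}

print(matches(('NP', 'bucket'), the_bucket))   # True
print(matches(('NP', 'bucket'), bucket_rel))   # True  -- the overgeneration noted above
print(matches(('NP', 'bucket'), a_stone))      # False
print(matches(('NP', None),     a_stone))      # True  -- plain polyvalent NP
```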

In the remainder of the paper, we show that option (4c/ii) has been implicit in CCG theory all along but never used, in the form of syntactic types for which only one value can substitute (Sect. 3). We call them singleton types. This way of lexical categorization and subcategorization predicts very limited syntax, but not as the metalinguistic marking that [27] proposed for kick the proverbial/old bucket. It follows from having to enumerate the different senses and contingencies of phrasal idioms (e.g. the proverbial bucket for the senses above, also covering e.g. when I face the proverbial bucket), and of MWEs such as pick up. In Sect. 4 we show that idiomatically combining phrases have principled distinctions from singleton types. Head-word subcategorization such as (6) is the more promising option for them, which radically lexicalized grammars can handle without extension. There are also idioms which require an analysis combining both options, such as those with semantic reflexives where the referent is not part of the idiom, e.g. I twiddle my/*his thumbs. Section 5 covers these cases.

These findings reveal some aspects of type substitution and its projection when the expressions are not fully compositional at the level of the predicate-argument structure. As such they may have implications beyond CCG.

Finally we show that adopting option (4b) to analyze for example pick \(\cdots \) up as pick up \((\cdots )_{\mathrm {wrap}}\) overgenerates in the combinatory version of wrap (Sect. 6), and complicates the grammar with a domino-effect in the surface version of wrap; therefore, it would do more damage than good if adopted for (discontiguous) MWEs and phrasal idioms. CCG can continue to avoid all forms of wrap in the presence of all kinds of MWEs and phrasal idioms.

3 Singleton Types

A brief preview of the proposal for (4c/ii) is as follows. A syntactic type is a singleton if only one value can substitute for it; it self-represents. We designate such types with strings, such as “up” or “the bucket”; for example:

figure am

(The function in the LFs of (7) yields a culminating state in the sense of [23].)

We call categories in (7) ‘paracompositional’ to highlight the fact that, although their LF correspondence is intact so that the derivational process is transparent, they might have seemingly vacuous abstraction from the perspective of the predicate-argument structure, symbolized by the placeholders x above.Footnote 3

However, one can make a case that this abstraction, corresponding respectively to the singleton categories “up” and “the bucket”, might have a role inside the LF constants shown in primes, as contingencies. We write them for example as \(die'_x\) (as ceremonial death, reported death, etc.), rather than \(die'\). These LF ‘constants’ are convenient generalizations in CCG, standing in for a plethora of features anyway, so it seems natural to think of them as having their own abstraction. (The semantic types corresponding to these contingencies are then \(\alpha \mapsto t\) for some \(\alpha \).)
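As a sketch of this idea, the following treats the LF ‘constant’ as a function of its contingency, so that the seemingly vacuous placeholder feeds the event modality rather than the predicate-argument structure. The constant and modality names are illustrative assumptions.

```python
# Sketch of an LF 'constant' carrying its own abstraction over a contingency,
# i.e. die'_x rather than plain die'.  The contingency is of type alpha -> t
# for some alpha; it modulates the extension of the predicate, not who does
# what to whom.

def die(contingency):
    """Return a one-place predicate whose extension is modulated by the
    contingency (event modality)."""
    def pred(subject):
        return ('die', subject, {'modality': contingency})
    return pred

# The singleton "the bucket" contributes only to the contingency slot:
kick_the_bucket_lf = die('ceremonial-or-reported-death')   # a closed term over e
print(kick_the_bucket_lf('john'))
# ('die', 'john', {'modality': 'ceremonial-or-reported-death'})
```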

It will be seen in Sect. 3.2 that the examples in (7) differ in their sense from picked up the book and kicked the blue bucket, therefore a separate grammar entry is empirically justified. The sense distinction is reflected explicitly in the LF, as we shall see later. Both possibilities for substitution, for the syntactic type and for its placeholder in the LF, are principally restricted by CCG.

Singletons also engender a way for such entries to be morphologically more transparent, for example by being susceptible to inflection, e.g. picking, by providing a segmental alternative to a contiguous but multi-word pick up \(\cdots \), which would need a morphological pointer for inflection, as noted by [27, 33] for their analyses. Nunberg, Sag and Wasow’s [24] dichotomy between phrasal idioms and idiomatically combining items also vanishes, because of the singleton types and head-word subcategorized argument types. The distinction between the syntactically pseudo-active kick the bucket and the more active spill the beans naturally follows from whether the idiomatic part has a role in the predicate-argument structure, which we capture by systematically choosing between options (4c/i) and (4c/ii) per lexical correspondence.

3.1 Parsing and Correspondence with Singleton Types

The crucial property of a category in a lexical correspondence such as

figure at

with singleton “s”, is that the string s as a category does have its own correspondence. This cannot be a literal match without categorial processing of the surface string s to the right of \(\alpha \). It is a compositional derivational process arising from (a) below, to lead to (b). The lexically specifiable difference from a polyvalent category is that the item \(\alpha \) subcategorizes for the string s, hence treats it as a category, rather than subcategorizing for the category which s itself derives. To obtain that category, the derivational process works as usual for s, independent of the item \(\alpha \). We shall see in (9) that rules of function application need no amendment for this interpretation. (8b) is lexically determined by \(\alpha \).

figure bb

The same idea applies to backward application, for

figure bc

and the sequence \(s\alpha \).

In other words, the surface string s is derived by the derivational process as well. It is just that the item \(\alpha \), carrying the singleton type as an argument, decides what to do with its semantics, which we indicated schematically above as a modal contribution to the contingency of p, as \(p_{x}\) of \(\alpha \). This is not post-processing of a category in a radically lexicalized grammar, in which all and only head functors decide what to do with the semantics of their arguments.

This means that, whether an argument type is polyvalent or singleton, there has to be an LF placeholder for it; otherwise the derivational process, which is completely driven by syntactic types in CCG, cannot proceed. It can be seen in the basic primitive of CCG, viz. function application:

figure bd

The LF of the functor, f, has to be a lambda abstraction, to be able to take any argument a and yield fa. This is true when the argument type is a singleton too.

We can clearly see the role of substitution rather than insertion in projection of types. The rule above is in fact realized as below (similarly for others):

figure bi

There is no sense in which we can insert something into \(\alpha \) and \(\beta \) as they form \(\alpha \beta \) because these are surface expressions.
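A sketch of how application with a singleton argument can be realized without amending the rule: the functor checks the surface string of an ordinarily derived constituent rather than its category, and surface concatenation reflects adjacency. The triple encoding and the schematic entries are our own assumptions, in the spirit of (8)–(10).

```python
# Sketch of forward application when the argument slot is a singleton type.
# Constituents are (surface, category, LF) triples.

class Singleton:
    def __init__(self, string):
        self.string = string

def fapply(func, arg):
    fsurf, fcat, flf = func
    asurf, acat, alf = arg
    result, slash, wanted = fcat              # fcat = (X, '/', Y), Y possibly a Singleton
    if slash != '/':
        return None
    if isinstance(wanted, Singleton):
        if asurf != wanted.string:            # match the surface expression itself
            return None
    elif wanted != acat:                      # ordinary (polyvalent) category match
        return None
    return (fsurf + ' ' + asurf, result, flf(alf))

# alpha := X/"s" : \x. p_x   -- schematically, x feeds the contingency of p
alpha = ('alpha', ('X', '/', Singleton('s')), lambda x: ('p', {'contingency': x}))
s     = ('s', 'Y', 's_lf')    # "s" is derived as usual, with its own category Y

print(fapply(alpha, s))       # ('alpha s', 'X', ('p', {'contingency': 's_lf'}))
```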

The singleton types present an asymmetry in argument-result (or domain-range) specification. Functors such as \(X/Y\) and \(X\backslash Y\) have domain \(Y\) and range \(X\), and, apart from trivial identities where \(X\) and \(Y\) are the same singleton, the interpretation where the range itself is a singleton is problematic. Since \(X|Y\) is a function into \(X\) for some slash \(|\), if it is not a trivial case of singleton identity, it is difficult to see how \(X\) can be a singleton. Although there are no formal reasons to avoid singleton results, and results of results, we conjecture that singletons are arguments, and arguments of results and arguments, because there seems to be no nontrivial function into a singleton result with grammatical significance.

A related argument can be made about a singleton’s potential to be the overall syntactic category of a lexical item. The notion of extending the phonological range of an item such as (a) below coincides naturally with the “words with spaces” idea (e.g. ad hoc, by and large, every which way), which is common in NLP treatments of MWEs, but (b) is also an option.

figure bv

Notice that (b) is different from having a lexically specified verbal adjunct category in the manner of [13], which, given (8), must either use entries similar to (11), or derive every which way syntactically and choose to trump its category because it wants a narrower LF due to singleton subcategorization. However, we think that both options may be redundant, because of the following.

In CCG the head functor decides the semantics of its entry even if it subcategorizes for a singleton category. Therefore the entries in (a–b) above, which we use in (a–b) below, may be redundant if the words in “words with spaces” are part of the grammar, and if they can combine in any way, say as in (c) below for some category:

figure by

There would be no post-processing or reanalysis in these cases; they would be multiple analyses because of redundancy. The transparency of derivation requires that in configurations like (8b) the constituents of the rule applying can themselves be derived.

The rules that allow CCG to rise above function application in projection, viz. composition and substitution, also maintain the transparency of the syntactic process, by being oblivious to the nature of argument types in these rules:Footnote 4

figure bz

If the result categories are not singletons, as we argued, then the rules above never face a case where the composed-over category Y is a singleton. This means that, since singletons are arguments, they occur under a slash, say X|“s” for some slash |, and that slash is inherently application-only, equivalently

figure ce

in [3] terminology.Footnote 5
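A sketch of how an application-only slash over a singleton blocks composition while leaving application untouched, in the spirit of (13)–(14). The modality encoding and the relative-clause-style composition attempt below are illustrative assumptions.

```python
# Sketch of slash modalities blocking composition over singleton arguments.
# Categories are (result, (slash, modality), argument); '*' marks an
# application-only slash, as on slashes over singletons.

def compose(left, right):
    """Forward composition X/Y:f  Y/Z:g => X/Z, harmonic slashes only."""
    xres, (xsl, xmod), xarg = left
    yres, (ysl, ymod), yarg = right
    if xsl == ysl == '/' and '*' not in (xmod, ymod) and xarg == yres:
        return (xres, ('/', ymod), yarg)
    return None

def apply_fwd(left, right_cat):
    """Forward application X/Y:f  Y:a => X, allowed for any modality."""
    xres, (xsl, _), xarg = left
    return xres if xsl == '/' and xarg == right_cat else None

you    = ('S', ('/', 'h'), 'S\\NP')               # type-raised subject S/(S\NP)
kicked = ('S\\NP', ('/', '*'), '"the bucket"')    # singleton object under a '*' slash

print(compose(you, kicked))              # None: 'you kicked' cannot compose over the
                                         # singleton, so relativization out of it fails
print(apply_fwd(kicked, '"the bucket"')) # 'S\\NP': plain application still derives the idiom
```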

This is corroborated by examples like the one below, where there is no idiomatic reading. (We show the derivation for the hypothetical case where singletons would be allowed to compose. Typing the singleton as

figure cg

eliminates the derivation. The slashes in the paper are harmonic unless stated otherwise.)

figure cj

For polyvalent types, one-to-one correspondence of syntactic types and placeholder types is meant to capture the thematic structure in CCG, for example for the door opened versus someone opened the door, by having two different (albeit related) correspondences for open.

For a singleton, its functor (and there must be one, since singletons can only be arguments) decides lexically whether there is a predicate-argument structural role for the placeholder in the LF, as we see in the distinction of spill the beans, where the secret is an argument of divulge′, versus kick the bucket, where the bucket or anything related to it is not an argument of die′.

Therefore, for CCG, MWEs and phrasal idioms are not exceptions that need non-transparent derivation, apart from lexical specification as something special. They are consequences of the nature of categories and radical lexicalization.

Also because of the properties described in this section, a string as a category cannot be empty, which would violate CCG’s principles of adjacency and transparency (see the Steedman references). No rule in (9) or (13) can apply if one of the categories is empty. Therefore the surface string itself for the singleton (s in example (8)) cannot be empty either.

Having explored the possibilities for the singleton types in combinatory categories, we look at their use.

3.2 Verb-Particles and Phrasal Idioms with Singleton Types

In verb-particle constructions, the differences in the syntax-semantics correspondence force the following lexical distinctions. We now write the categories in more detail than in the preview.

figure co

The features above are all finite-state computable, just like morphological ones, for example phonological weight (\(\pm \)heavy) and lexical content (\(\pm \)lexc) in an expression substituting for a category. All CCG category features can be interpreted this way, because combinators do all the syntactic work.

The reason for having two different grammar entries (a–b) for pick up follows from the fact that they are not equally substitutable, for example as an answer to What did you do?

(15b) leads to an achievement, and (15a) to a culmination. Both cases also differ from (c), which provides wider substitution for the particle’s slot, and with a different meaning. We treat the (a–c) distinctions surface-compositionally; they are transparently projected without wrap:

figure cq

where the verb at the end of the derivation can interpret its event modality (contingency) compositionally, since it is a closed lambda term.

Notice that the word up knows nothing about the verb-particle construction. Its category is that of a PP head, as a predicate modifier. It is the verb that delivers the distinct meaning. Its subcategorization is for a singleton, which eschews the syntactic category of the word up but not its phonology and semantics, as described in (8b).
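A sketch of the lexical contrast, in the spirit of (15): one entry subcategorizes for the singleton “up”, whose semantics feeds the event modality, while another takes a polyvalent PP argument. The categories, LFs and names below are our own assumptions, not the paper’s exact entries.

```python
# Sketch of two 'pick' entries: a verb-particle one subcategorizing for the
# singleton "up" (the particle contributes a contingency), and an ordinary
# PP-taking one (the PP is a genuine argument).

def singleton(s):
    return ('STR', s)

LEXICON = {
    # pick_vpc := ((S\NP)/"up")/NP : particle feeds the event modality
    'pick_vpc': ((('S\\NP', '/', singleton('up')), '/', 'NP'),
                 lambda obj: lambda prt: lambda subj:
                     ('pick-up', subj, obj, {'modality': prt})),
    # pick_pp  := ((S\NP)/PP)/NP : e.g. 'pick the book up the stairs'
    'pick_pp':  ((('S\\NP', '/', 'PP'), '/', 'NP'),
                 lambda obj: lambda pp: lambda subj: ('pick', subj, obj, pp)),
}

cat, lf = LEXICON['pick_vpc']
# after combining with 'the book' and then with the particle 'up' (derived as usual):
print(lf('the_book')('up')('i'))
# ('pick-up', 'i', 'the_book', {'modality': 'up'})
```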

(15b) can be assumed to arise from the syntactic category in (15a) by finite inflection. CCG has options here to accommodate morphology without having to posit a “morphological insertion point” in a contiguous but multi-word entry such as pick up, to avoid ?pick upped.Footnote 6 This is made possible by singleton types.

Examples (15a–b) use a degree of freedom which is relevant to phrasal idioms. The singleton syntactic type “up” corresponding to the LF placeholder x maintains the compositionality of the correspondence; but, it may have no contribution to the predicate-argument structure at all in some cases, which would make it paracompositional, because its semantic type is a closed lambda term as far as the predicate-argument structure is concerned. Notice that in (8b), the singleton’s contribution is not in the predicate-argument structure of p; it is a contingency of p.

Consider the following examples in this regard, where x, as an event modality of die′, might mean ‘ceremonial death’, ‘reported death’, etc.:

figure cx

They anticipate very limited syntax in the semantically paracompositional part in the idiomatic reading (a), because such variants have to be enumerated (kick the old/proverbial bucket vs. kick the bucket that John thought overflowed).Footnote 7 These assumptions cannot give rise to the idiom reading in the bucket that you kicked, with no further stipulation than singleton categories in a lexical entry (cf. a–b; ‘*’ on the right of a derivation means it is not possible):

figure cz

Given the polyvalent argument category of the relative pronoun, we can see that relativization out of phrasal idioms would not be possible even if we allowed composition of singleton types; therefore the syntactic productivity of idiomatically combining phrases arises from their use of head-dependencies rather than singletons, as we shall soon see in derivations similar to (b), in (26).

We note that carrying the head-word in a polyvalent category to have the same effect, for example \(NP_{\mathrm{bucket}}\), overgenerates the idiom reading, because the bucket that John thought overflowed can substitute for \(NP_{\mathrm{bucket}}\).

The direct approach to categories that we see in radically lexicalized grammars, whether they are polyvalently substitutable or not, contrasts with systems of rewrite and/or record keeping in which post-processing is possible. For example there is no reanalysis or post-processing mechanism needed to eliminate the idiomatic reading below:

figure dc

We can then follow [31] in assuming that passive is a polyvalent lexical process headed by the passive morpheme, mapping for example a transitive verb category to its passive counterpart, which eliminates the passivization *the breeze was shot from the entry:

figure df

Idioms such as at any rate and beside the point further demonstrate that all idioms needing restricted types must contain a predicative element in the domain of locality of their head, because we are required by paracompositionality to record the special reading and contingency, for example as extensions of discursive clarification (a) and comparison (b):

figure dg

4 Head-Word Subcategorization and Idioms

The difference between idiomatically combining phrases and phrasal idioms such as kicking the bucket is clear: the syntactically active ones are active because the idiomatic part has a role in the predicate-argument structure. ‘Secret’ is an argument of ‘divulge’, whereas ‘bucket’ is not an argument of ‘die’. For example, spill the beans seems to require a categorization such as (a) below in the manner of (6), rather than (b) fashioned from (5) or the singleton-subcategorizing (c). Cf. also the non-idiomatic spill in (d). Tense morphology yields the finite versions of the categories below; (a), for example, gets a tensed counterpart.

figure dk

The argument type in (a) is a predicative phrase type, which includes the quantifier phrase. The syntactic type of the idiomatic argument in (a) encodes the head-dependency from surface structure. It avoids the idiomatic reading in to spill the bean, which (b) may not. (b)-style solutions depend on LF objects, which may not always reflect surface forms in full. In fact (b) requires post-processing to eliminate the idiom reading in the following example:

figure dl

This is still the case if we treat the construction as multi-headed, as [15]:238 do, by also assuming an idiomatic entry for beans, and changing the LF choice condition of spill to ‘if head(x) is the idiomatic beans then the idiomatic reading, else the literal one’. The beans in the example above does not refer to this entry.
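A sketch contrasting the two strategies: with surface head-marking as in (a) the decision is made in the derivation itself, whereas an LF-internal choice condition as in (b) derives everything and must filter afterwards. The phrase records and tests below are illustrative assumptions.

```python
# Sketch contrasting surface head-marking (22a-style) with an LF-internal
# choice condition (22b-style): the difference is *when* the idiomatic
# reading is decided.

def spill_head_marked(np):
    """(S\\NP)/NP_beans: the derivation only goes through if the surface head
    of the object is 'beans'; the idiomatic LF is then the only option."""
    if np['head'] != 'beans':
        return None                           # no idiomatic derivation at all
    return lambda subj: ('divulge', subj, ('secret', np['words']))

def spill_lf_choice(np):
    """(S\\NP)/NP with an LF-internal test: the derivation always succeeds,
    and idiomatic readings must be filtered out afterwards (post-processing)."""
    reading = 'divulge' if np['head'] == 'beans' else 'spill'
    return lambda subj: (reading, subj, np['words'])

juicy_beans = {'head': 'beans', 'words': 'the juicy beans'}
the_bean    = {'head': 'bean',  'words': 'the bean'}

print(spill_head_marked(juicy_beans)('kim'))   # idiomatic reading licensed
print(spill_head_marked(the_bean))             # None: no idiom for 'spill the bean'
print(spill_lf_choice(the_bean)('kim'))        # literal reading, chosen inside the LF
```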

The process of marking head-word dependencies requires statistical learning, as a category such as \(NP_{\mathrm{beans}}\) in (22a) implies. It has been known in TAG systems with supertags since [6] that disambiguating such categories is feasible with training. The earliest approach to such marking in CCG is [8, 9] as far as we know, where probabilistic CCGs are similarly trained. Later work such as [1] shows further progress in disambiguation of head-dependencies.

\(NP_{\mathrm{beans}}\) is a polyvalent type, not a singleton. Therefore we get the following accounted for by (22a) (some of the examples are from [33]):

figure ds

Head-marking of an argument category by the idiom’s head is required because of examples such as the one below, where the idiomatic reading is eliminated despite relatively free syntax, because the coordinands would not be like-typed:

figure dt

Right-node raising succeeds with non-idiomatic entries such as (22d), which do not subcategorize for head-word-marked arguments; (25) is unproblematic with such an entry.

When the head of the construction does not require identical types, as the conjunction above does, head-projection works with a simple term match; cf. the one for kicking the bucket in (18a) (h is for the head-word feature):

figure du

The example also shows that argument types of idiomatically combining phrases must be composable; therefore, (22c) is inadequate.Footnote 8

5 Idioms Requiring a Combined Approach

There seem to be cases where a combination of singletons and head-marked polyvalent subcategorization is needed. The give creeps construction, which is sometimes considered not an idiom because of its compositionality [19], is paracompositional in our sense, and idiomatically combining in [24] terminology, because although creeps seems to be an event modality of the predicate rather than its argument, the recipient is an argument. A simple head-marking approach such as \(NP_{\mathrm{creeps}}\) would overgenerate in cases such as \(\sharp \)give me some creeps, but we have give me the absolute/shivering/full-on creeps. Notice also that the construction and related items resist dative shift (judgments are from [20]; ‘*’ seems to be equivalent to ‘\(\sharp \)’ in our terms):

figure dz

Richards [26] observes that (a) below can be the unaccusative of give; and (b) is widely attested on the web (but recall \(\sharp \)give me some creeps).

figure ea

Assuming that dative shift is polyvalent, following [31], in the form of a lexical mapping between the double-object and to-dative categories, we can eliminate it for the type in (c), which we think captures the insight of Richards, and permits adjunction within an N, e.g. mountains of creeps.

Another class of idioms forces a combined approach as well. Semantic reflexives in I twiddled my thumbs/ate my words/racked my brain/lose my mind are not morphological reflexives and they are inherently possessive, for example:

figure ed

The LF captures the properties that the subject idles on their own time, and that the lexical possessive in the LF of x, which is presumably lexically specified, is inalienable and belongs to the subject. This is a reflexive in the sense that it must be bound in its local domain, which is determined by the head of the idiom. The referent (z) is available in one domain of locality in a radically lexicalized grammar because the head of the idiom does not require a VP in the phrase-structure sense but a clause. Agreement is locally available too, by insisting on the same agreement features. The head-dependency is that the argument does not contain lexical material, leaving out examples such as John twiddled John’s thumbs as idioms.
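A sketch of the combined approach for such reflexives: the head subcategorizes for a head-marked possessive argument, rejects lexical possessors, and binds the possessor to the subject within its domain of locality. All representations below are illustrative assumptions.

```python
# Sketch of 'twiddle my/*his thumbs': the possessor must be pronominal
# (no lexical possessor such as John's) and must agree with the subject,
# to which it is bound in the LF.

def twiddle(obj):
    """Accept a possessive NP headed by 'thumbs' and bind its possessor."""
    if obj['head'] != 'thumbs' or obj['possessor']['lexical']:
        return None                       # 'John twiddled John's thumbs' is out as an idiom
    def with_subject(subj):
        if subj['agr'] != obj['possessor']['agr']:
            return None                   # 'I twiddled his thumbs' is out as an idiom
        return ('idle', subj['word'], {'possessor': subj['word']})   # bound locally
    return with_subject

my_thumbs  = {'head': 'thumbs', 'possessor': {'agr': '1s', 'lexical': False}}
his_thumbs = {'head': 'thumbs', 'possessor': {'agr': '3s', 'lexical': False}}
i_subj     = {'word': 'i', 'agr': '1s'}

print(twiddle(my_thumbs)(i_subj))    # reflexive reading, possessor bound to subject
print(twiddle(his_thumbs)(i_subj))   # None: possessor fails to agree with the subject
```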

6 No Wrap

We have seen that options (4c/i) and (4c/ii) are not mutually exclusive. We also suggested that the singleton type is a forced move to avoid loss of meaning composition. One consequence of this is the treatment of verb-particle constructions without wrap, which are not related to idioms although they are MWEs. We now consider option (4b) in more detail from this perspective; at first sight it seems to be just as lexical as the two alternatives we have considered so far.

The projection principle of CCG, which says that lexical specification of directionality and order of combination cannot be overridden during derivations, eliminates (30) from projection, because it has the second-combining argument of a function applying before its first-combining argument, an operation of the general class that has been proposed in other categorial approaches under the name of “wrap.”

figure ei

Wrap of the kind in (30) has a combinatory equivalent, namely one of Curry’s combinators (see [11]). CCG’s adjacency principle eliminates this combinator, on empirical rather than formal grounds, as a freely operating rule. Adding (30) to CCG’s projection has the effect of treating VSO and VOS as both grammatical, which is not the case for Welsh, and as carrying the same meaning, which is not the case for Tagalog although both VSO and VOS are fine. These properties must be part of a lexicalized grammar rather than of syntactic projection.

The version of wrap which [2, 12, 16] employ is different; it was eliminated from consideration so far because it is non-combinatory, and it violates the adjacency of functors and arguments. That wrap is the following:

figure ek

where the first() function gives the first element in a list of surface expressions for Bach [2], or the first word for Dowty [12]; and rest() returns the rest of the expression. The wrapping slash

figure el

of Jacobson [16] does the infixation of \(s_2\).

Semantically, it is function application. Syntactically, no combinator can do what this rule does to its input expressions, which is to rip apart one surface expression (\(s_1\)) and insert \(s_2\) into it. It differs from the combinator mentioned above, which wraps one independent expression in two independent expressions.
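A sketch of the surface wrap in (31) with Dowty-style first()/rest(), using the pick ⋯ up example; the strings and LF are illustrative assumptions. The point is that \(s_1\) is taken apart, so adjacency is violated even though the semantics is plain application.

```python
# Sketch of the non-combinatory surface wrap (31): the functor's surface
# expression s1 is ripped apart and s2 is infixed after its first word.

def first(s1):
    return s1.split()[0]

def rest(s1):
    return ' '.join(s1.split()[1:])

def wrap(functor, argument):
    s1, lf = functor
    s2, a  = argument
    surface = f"{first(s1)} {s2} {rest(s1)}"   # infixation: adjacency is violated
    return (surface, lf(a))

pick_up  = ('picked up', lambda x: ('pick-up', x))
the_ball = ('the ball', 'the_ball')

print(wrap(pick_up, the_ball))
# ('picked the ball up', ('pick-up', 'the_ball'))
```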

The appeal of surface wrap to MWEs was to be able to write a category for pick \(\cdots \) up as for example

figure en

cf. (16).

Syntactic wraps such as the above, whether combinatory or non-combinatory, have domino effects on dependency and constituency, unlike ‘lexical wrap’, where a lexical entry specifies its correspondence, for example for a strictly VSO Welsh verb whose LF applies to its arguments in an order different from their surface order of combination.

An example of global complications in grammar caused by wrap can be seen below, where dashed boxes denote wrapped-in material; cf. Figure 1.

figure ep

Derivation (a) is Bach’s use of the non-combinatory wrap rule in (31). Given these categories, which involve wrap, there is one interpretation for (b), where the adverb can only modify persuade. With the unwrapped version of persuade in (c), two interpretations are possible: one modifies the VP complement of persuade, and the other, persuade John, both of which are required for adequacy.

7 Conclusion

One point of departure of CCG from other categorial grammars and from tree-rewriting systems is that (i) we can complicate the basic vocabulary of the theory, but (ii) not its basic mechanism, such as by introducing wrap, if a better explanation can be achieved. The first point has been made by Chomsky repeatedly since [7]:68. Singleton types could be viewed as one way of doing that. We have argued that it is actually not a complication at all in CCG’s case, because the possibility has been available all along, in the notion of a type as a set of values, which can be a singleton set. To address the second point, CCG differs from the Chomskyan notion of category substitution by eliminating move, empty categories and lexical insertion altogether, which means that all computation is local and type-driven, and there is no action at a distance. The expressions substituting for these types are then locally available in the course of a derivation. This seems critical for MWEs.

The possibility of a singleton value is built into any type. The asymmetry of CCG’s categorization of singletons, namely that they can be arguments, and arguments of arguments and results, together with their inherently applicative nature, delivers MWEs and phrasal idioms as natural consequences rather than as stipulations or a “pain in the neck for NLP.” Syntactically active idioms are not singleton-typed because they have relevance to the predicate-argument structure; and their narrower syntax, compared to free syntax, seems to necessitate head-marking of some argument categories, which is known to be probabilistically learnable.

Some implications of our analyses are that all idioms can be made compositional at the level of a lexical correspondence without losing semantic distinctions, and without meaning postulates or reanalysis. Categorial post-processing of MWEs and phrasal idioms, and multi-stage processing of them in the lexicon, as done by [10, 33], may be unnecessary if we assume that type substitution may involve a single value, and that surface head-marking is an option for polyvalent argument types. One conjecture is that any idiom in any language has to involve a predicate implicated by some predicative element in the expression, to keep the meaning assembly paracompositional.

The analyses in the article can be replicated by running the CCG tool at github.com/bozsahin/ccglab. The particular fragment in the chapter is at github.com/bozsahin/ccglab-grammars/cb-ag-fg2018-grammar.