
8.1 Fuzzy Set Theory, Natural Language and Human Reasoning

8.1.1 Motivation and History

Fuzzy set theory is the basis of methods that can, in general, be divided into two basic classes: (a) methods with linguistic motivation, and (b) methods with non-linguistic motivation. Typical examples of (b) are fuzzy clustering (cf., e.g., [16]) and the new and very powerful fuzzy transform (see, e.g., [53]).

This paper focuses on methods (a) with linguistic motivation. As a possibility for future research, we suggest focusing on fuzzy natural logic (FNL), the logic of natural human reasoning, for which the use of natural language is typical. Our suggestion stems from the claim that fuzzy set theory has the potential to serve as a good tool for modeling linguistic semantics. This was argued by L.A. Zadeh in many of his papers from the very beginning (cf., e.g., [63, 65, 66, 68]). It should also be noted that the first necessary steps towards FNL have already been taken.

The problem, however, is not easy and it requires close cooperation with linguists. Zadeh suggested two simplified paradigms: computing with words (cf. [69]) and precisiated natural language (see [70]). In the first case, it is assumed that we should confine ourselves to a small number of special linguistic expressions. In the literature, one can meet the term “linguistic label” (cf. [23, 26, 64]). From the linguistic point of view, these are expressions consisting of a degree or evaluative adjective together with (possibly) some hedge. This model, however, is oversimplified, and quite often the authors have in mind not the given linguistic expressions but linguistically named, linearly ordered evaluative categories of the kind used in various questionnaires. These are introduced to simplify the respondent’s work. For example, instead of using the numbers 1–5 directly, one is asked to consider them as typical examples of “very small” (1), “small” (2), “medium” (3), “big” (4) and “very big” (5). These categories are then taken as imprecise quantities whose meaning is modeled using triangular fuzzy sets. In this case, though, we cannot say that we are using natural language.
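The questionnaire-style use of linguistic labels described above can be illustrated by a minimal sketch. The corner points of the triangles and the names `triangular`, `LABELS` and `memberships` are assumptions made only for this illustration; the text does not prescribe any particular shape beyond "triangular".

```python
def triangular(left, peak, right):
    """Return a triangular membership function with the given corner points."""
    def mu(x):
        if x == peak:
            return 1.0
        if left < x < peak:
            return (x - left) / (peak - left)
        if peak < x < right:
            return (right - x) / (right - peak)
        return 0.0
    return mu


# Hypothetical labels on the questionnaire scale 1..5.
# Each triangle peaks at one answer and vanishes at its neighbours (assumed shape).
LABELS = {
    "very small": triangular(0, 1, 2),
    "small": triangular(1, 2, 3),
    "medium": triangular(2, 3, 4),
    "big": triangular(3, 4, 5),
    "very big": triangular(4, 5, 6),
}


def memberships(x):
    """Membership degree of the value x in every labelled category."""
    return {name: mu(x) for name, mu in LABELS.items()}


if __name__ == "__main__":
    # A value halfway between two answers belongs partly to both categories.
    print(memberships(2.5))
```

Such a model only orders five named categories on a numeric scale, which is why it falls short of a model of natural language.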

The concept of a precisiated natural language (PNL) is wider: it suggests developing a “reasonable working formalization of the semantics of natural language without pretensions to capture it in detail and fineness.” The goal is to provide an acceptable and applicable technical solution. The concept of PNL is based on two main premises:

  (a) much of the world’s knowledge is perception based,

  (b) perception based information is intrinsically fuzzy.

It should be noted that the term perception is not considered here as a psychological term but rather as a result of intrinsically imprecise human measurement. The PNL methodology requires the presence of a World Knowledge Database and a Multiagent, Modular Deduction Database. The former contains all the necessary information, including perception based propositions describing the knowledge acquired by direct human experience, which can be used in the deduction process. The latter contains various rules of deduction. Until recently, however, no exact formalization of PNL had been developed, and so it should be considered mainly as a reasonable methodology.

We are convinced that the potential of fuzzy set theory and fuzzy logic is strong enough to enable the development of a working model of linguistic semantics and, on the basis of that, also a model of natural human reasoning. As has been convincingly argued by many authors (Footnote 1), vagueness is an unavoidable feature of natural language semantics. We argue that the idea of fuzzy sets and fuzzy logic provides a reasonable model of vagueness (Footnote 2).

Recall that in one of his early papers L.A. Zadeh [67] suggested modeling commonsense reasoning. The idea of developing a logical model of commonsense reasoning, however, is much older: it was proposed by J. McCarthy in 1959 [29] as a part of the program of logic-based artificial intelligence. Its paradigm is to develop formal commonsense theories and systems, using mathematical logics, that exhibit commonsense behavior. The reason is that commonsense reasoning is a central part of human thinking and we cannot imagine real intelligence without it. The main drawback of the existing formalizations of commonsense reasoning, in our opinion, is that they neglect the vagueness present in the meaning of natural language expressions (cf. [5] and the citations therein). Therefore, a model of commonsense reasoning based on fuzzy sets and fuzzy logic can be more realistic.

The above concept was initiated in AI. A related concept came from linguists: in 1970, G. Lakoff published a paper [21] in which he introduced the concept of natural logic with the following goals:

  • to express all concepts capable of being expressed in natural language,

  • to characterize all the valid inferences that can be made in natural language,

  • to mesh with adequate linguistic descriptions of all natural languages.

Natural logic is thus a collection of terms and rules that come with natural language and that allow us to reason and argue in it. According to G. Lakoff’s hypothesis, natural language employs a relatively small finite number of atomic predicates that take sentential complements (sentential operators) and are related to each other by meaning-postulates that do not vary from language to language. The concept of natural logic has been further developed by several authors (Footnote 3).

In the following subsections we will try to convince the reader that it is reasonable to develop the concept of fuzzy natural logic (FNL), which continues the concept of natural logic mentioned above. We will show that a good portion of the work has already been done.

8.1.2 The Paradigm of FNL

If we put all the ideas above together, we come to the concept of fuzzy natural logic as a new theory that should be based on the results of linguists, logicians and AI specialists in natural logic, logical analysis of natural language (see, e.g., [7]), and commonsense reasoning. Our suggestion for the future is to develop FNL as an extension of mathematical fuzzy logic. The constituents of FNL that have been partly elaborated so far can be summarized as follows:

  (a) Formal theory of evaluative linguistic expressions (Footnote 4).

  (b) Formal theory of fuzzy IF-THEN rules and approximate reasoning (derivation of a conclusion) [8, 10, 36, 46, 47].

  (c) Formal theory of intermediate and generalized quantifiers [9, 15, 37, 40].

Let us remark that there are some other papers whose topics relate to the topic of FNL (Cf. [20, 62]). None of them, however, can be considered as a contribution to the consistent development of FNL as a formal logical theory.

The essential constituent of FNL is a model of linguistic semantics. Many logicians and linguists (cf. [27, 28, 57]) have argued that first-order logic is not sufficient for this task. The formal system chosen as the basis for the further development of FNL is a higher-order fuzzy logic called fuzzy type theory.

8.1.3 Fuzzy Type Theory—The Mathematical Tool for FNL

The main mathematical tool for FNL is the fuzzy type theory (FTT), which is a higher-order mathematical fuzzy logic. There are several kinds of FTT, which differ in the algebra of truth values used. For FNL, the most important is the Łukasiewicz fuzzy type theory (Ł-FTT), whose algebra of truth values is an MV-algebra.

In this section we very briefly outline some of the main concepts of FTT. Details and full proofs of all theorems can be found in the literature [35, 42, 43]. Let us remark that FTT generalizes the classical type theory.

Syntax of Ł-FTT

The basic syntactical objects of Ł-FTT are classical (see [1]), namely the concepts of type and formula. The types are special subscripts (denoted by Greek letters) assigned to all formulas, by means of which we distinguish the kinds of objects represented by formulas. The atomic types are \(\epsilon \) representing elements and \(o\) representing truth values. The set of all types is denoted by \(\textit{Types}\).

The language \(J\) of Ł-FTT consists of variables \(x_{\alpha }, \ldots \), special constants \(c_{\alpha }, \ldots \) (\(\alpha \in \textit{Types}\)), the symbol \(\lambda \), and brackets. We will consider the following concrete special constants: \({\mathbf {E}}_{(o\alpha )\alpha }\) (fuzzy equality) for every \(\alpha \in \textit{Types}\), \(\mathbf {C}_{(oo)o}\) (conjunction), \(\mathbf {D}_{oo}\) (delta operation on truth values) and the description operator \(\iota _{\epsilon (o\epsilon )}\).

Formulas are formed of variables, constants (each of a specific type), and the symbol \(\lambda \). As mentioned, each formula \(A\) is assigned a type (we write \(A_{\alpha }\)). The set of formulas of type \(\alpha \) is denoted by \(\textit{Form}_{\alpha }\) and the set of all formulas is \(\textit{Form}=\bigcup _{\alpha \in \textit{Types}} \textit{Form}_{\alpha }\) (Footnote 5).

Recall that if \(B\in \textit{Form}_{\beta \alpha }\) and \(A\in \textit{Form}_{\alpha }\) then \((BA)\in \textit{Form}_{\beta }\). Similarly, if \(A\in \textit{Form}_{\beta }\) and \(x_{\alpha } \in J\), \(\alpha \in \textit{Types}\), is a variable then \((\lambda x_{\alpha }\, A)\in \textit{Form}_{\beta \alpha }\).

The main connective is equivalence \(\equiv \) defined by \( \lambda x_{\alpha } \lambda y_{\alpha }({\mathbf {E}}_{(o\alpha )\alpha }\, y_{\alpha })x_{\alpha }\) for all types \(\alpha \in \textit{Types}\). As usual, we write \((A_{\alpha } \equiv B_{\alpha })\) instead of \((\equiv A_{\alpha })B_{\alpha }\). Note that this is a formula of type \(o\).

Further connectives are conjunction (\(\pmb {\wedge }\,\mathrel {:=}\,\lambda x_{o} \lambda y_{o}(\mathbf {C}_{(oo)o}\, y_{o})x_{o}\)), implication (\(\pmb {\Rightarrow } \,\mathrel {:=}\,\lambda x_o \lambda y_o\, (x_o\pmb {\wedge }y_o)\equiv x_o\)), negation (\(\pmb {\lnot }\mathrel {:=}\,\lambda x_o (x_o\equiv \bot )\)), strong conjunction (\( \mathop {\pmb { \& }}\nolimits \mathrel {:=}\,\lambda x_o(\lambda y_o (\pmb {\lnot }(x_o\,\pmb {\Rightarrow }\,\pmb {\lnot }y_o)))\)), disjunction (\(\pmb {\vee }\, \mathrel {:=}\,\lambda x_o(\lambda y_o (x_o\,\pmb {\Rightarrow }\, y_o)\pmb {\Rightarrow }y_o)\)) and delta (\(\pmb {\varDelta }\,\mathrel {:=}\,\lambda x_{o}\mathbf {D}_{oo}x_o\)). The general (\(\forall \)) and existential (\(\exists \)) quantifiers are also defined as special formulas [35].

The fuzzy type theory has 17 logical axioms. Most of them are introduced to characterize properties of the considered algebra of truth values. For FNL, the considered algebra is an MV-algebra. There are also two inference rules:

(R):

Let \(A_{\alpha }\equiv A'_{\alpha }\) and \(B\in \textit{Form}_o\). Then infer \(B'\) where \(B'\) comes from \(B\) by replacing one occurrence of \(A_{\alpha }\), which is not preceded by \(\lambda \), by \(A'_{\alpha }\).

(N):

Let \(A_o\in \textit{Form}_o\). Then, from \(A_o\) infer \(\pmb {\varDelta }A_o\).

The inference rules of modus ponens and generalization are derived rules in Ł-FTT. The concepts of provability and proof are defined in the same way as in classical logic. A theory \(T\) over Ł-FTT is a set of formulas of type \(o\) (\(T\subset \textit{Form}_o\)). By \(T\vdash A_o\) we mean that \(A_o\) is provable in \(T\). Many theorems characterizing syntactical properties of FTT have been proved, including the deduction theorem.

Semantics of Ł-FTT

The truth values form an MV-algebra (see [4, 49]) extended by the delta operation. It can be seen as a residuated lattice [13, 35] \(\mathcal {L}=\langle L, \vee , \wedge , \otimes , \rightarrow , \mathbf {0}, \mathbf {1}, \varDelta \rangle \). An important special case is the standard Łukasiewicz MV\(_\varDelta \)-algebra

$$\begin{aligned} \mathcal {L}=\langle [0, 1], \vee , \wedge , \otimes , \rightarrow , 0, 1, \varDelta \rangle \end{aligned}$$
(8.1)

where

$$\begin{aligned} \begin{array}{cc} \wedge = \text {minimum}, &{}\qquad \vee = \text {maximum},\\ a\otimes b = \max (0, a+b-1 ), &{}\qquad a\rightarrow b = \min (1, 1-a+b),\\ \lnot a = a\rightarrow 0 =1-a, &{}\qquad \varDelta (a) = {\left\{ \begin{array}{ll} 1&{} \text {if } a= 1,\\ 0&{}\text {otherwise}. \end{array}\right. } \end{array} \end{aligned}$$

We will also consider the operation \(a\oplus b= \min (1, a+b)\). This algebra generates the variety of MV-algebras. Therefore, the MV-operations \(\otimes , \oplus \) are often called Łukasiewicz conjunction and Łukasiewicz disjunction, respectively.
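The operations of the standard Łukasiewicz MV\(_\varDelta \)-algebra are easy to experiment with. The following minimal sketch implements them with exact rational arithmetic and checks, on a finite grid, that the connectives defined syntactically in Ł-FTT (implication as \((x\wedge y)\equiv x\), strong conjunction as \(\lnot (x\Rightarrow \lnot y)\), disjunction as \((x\Rightarrow y)\Rightarrow y\)) coincide with the operations in (8.1). All function names are hypothetical, and interpreting \(\equiv \) on truth values as the biresiduum \((a\rightarrow b)\wedge (b\rightarrow a)\) is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import product


def meet(a, b):
    return min(a, b)                 # a AND b (lattice minimum)


def join(a, b):
    return max(a, b)                 # a OR b (lattice maximum)


def strong_and(a, b):
    return max(0, a + b - 1)         # Lukasiewicz conjunction


def strong_or(a, b):
    return min(1, a + b)             # Lukasiewicz disjunction


def implies(a, b):
    return min(1, 1 - a + b)         # residuum


def neg(a):
    return implies(a, 0)             # equals 1 - a


def delta(a):
    return 1 if a == 1 else 0        # crisp "fully true" test


def equiv(a, b):
    # Assumption: equivalence on truth values is the biresiduum.
    return meet(implies(a, b), implies(b, a))


def derived_connectives_agree(steps=10):
    """Check the syntactic definitions against the algebra on a finite grid."""
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    for a, b in product(grid, repeat=2):
        if equiv(meet(a, b), a) != implies(a, b):
            return False
        if neg(implies(a, neg(b))) != strong_and(a, b):
            return False
        if implies(implies(a, b), b) != join(a, b):
            return False
    return True


if __name__ == "__main__":
    print(derived_connectives_agree())
```

Exact fractions are used instead of floating-point numbers so that the equalities are tested without rounding error; a grid check is only an illustration, not a proof.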

Let \(J\) be a language of Ł-FTT and \((M_{\alpha })_{\alpha \in \textit{Types}}\) be a system of sets called a basic frame such that \(M_o, M_{\epsilon }\) are sets and for each \(\alpha , \beta \in \textit{Types}\), \(M_{\beta \alpha }\subseteq M_{\beta }^{M_{\alpha }}\), i.e., it is a set of functions from \(M_{\alpha }\) to \(M_{\beta }\) (Footnote 6). A general frame is a tuple

$$\begin{aligned} {\mathcal {M}}=\langle \left( M_{\alpha }, =_{\alpha }\right) _{\alpha \in \textit{Types}}, \mathcal {L}_{\varDelta }\rangle \end{aligned}$$
(8.2)

so that the following holds:

  (i) \(\mathcal {L}_{\varDelta }\) is a structure of truth values (i.e., an MV-algebra extended by \(\varDelta \)). We put \(M_o=L\) and assume that the set \(M_{oo}\cup M_{(oo)o}\) contains all the operations from \(\mathcal {L}_{\varDelta }\).

  (ii) \(=_{\alpha }\) is a fuzzy equality on \(M_{\alpha }\) and \(=_{\alpha }\in M_{(o\alpha )\alpha }\) for every \(\alpha \in \textit{Types}\).

A general model is a general frame \({\mathcal {M}}\) such that for every formula \(A_{\alpha }\), \(\alpha \in \textit{Types}\), the interpretation \({\mathcal {M}}_p\) gives

$$ {\mathcal {M}}_p(A_{\alpha })\in M_{\alpha } $$

where \(p\) is an assignment of elements from the sets \(M_{\alpha }\) to variables (depending on the given type). This means that each set \(M_{\alpha }\) from the frame \({\mathcal {M}}\) has enough elements so that the interpretation \({\mathcal {M}}_p(A_{\alpha })\) is always defined. A general model \({\mathcal {M}}\) is a model of a theory \(T\), \({\mathcal {M}}\,\models \,T\), if \({\mathcal {M}}(A_o)=\mathbf {1}\) holds for all axioms of \(T\). If \(A_o\) is true in the degree \(\mathbf {1}\) in all general models of \(T\) then we write \(T\,\models \,A_o\).

Let \(T\) be a theory. A formula \(A_o\) is true in the degree \(a\in L\) in \(T\), if

$$\begin{aligned} a=\bigwedge \{{\mathcal {M}}_p(A_o)\mid {\mathcal {M}}\,\models \,T, p\in Asg({\mathcal {M}})\}. \end{aligned}$$
(8.3)

In this case, we write \(T\,\models _a\,A_o\). If \(a=1\) then we omit the subscript.

The following completeness theorem was proved (see [35, 43]).

Theorem 1 (completeness)

  (a) A theory \(T\) is consistent iff it has a general model \({\mathcal {M}}\).

  (b) For every theory \(T\) and every formula \(A_o\),

    $$ T\vdash A_o\iff T\,\models \,A_o. $$

The completeness theorem is an important result assuring us that FTT is a well founded mathematical tool that can be used for the development of FNL. We are thus able to formulate many results syntactically and, at the same time, to be sure that our results hold in all models. Hence, FNL is very powerful and encompasses most results in fuzzy set theory obtained semantically. Consequently, we should always try to formulate our problem syntactically and then use it in semantic interpretation.

8.1.4 Future Prospects of Fuzzy Set Theory in Linguistic Modeling

As one possible direction in the future development of fuzzy set theory and fuzzy logic, we suggest focusing on fuzzy natural logic. This logic should be developed as an extension of mathematical fuzzy logic in the narrow sense.

The work on FNL consists of two tasks: (a) development of a mathematical model of linguistic semantics and (b) characterization of the fundamental reasoning schemes of the human mind. Both tasks require close cooperation with linguists and logicians.

Natural language, however, is an extremely complicated structure with many subtleties and exceptions. Even simple linguistic units, such as evaluative adjectives (e.g., “good”, “interesting”, etc.), can be used in many contexts and ways, and their meaning may depend on their position within the topic-focus articulation (cf. [14]). At the present stage we can therefore hardly hope to develop a mathematical model of the semantics of natural language that would capture all the details. It is thus questionable whether we should strive to capture the semantics of natural language completely. It seems reasonable to relax our requirements and, in line with the paradigm of Zadeh’s precisiated natural language, focus on smaller parts of natural language and try to capture only their essential properties, of course with the prospect of continuously improving and deepening the model as linguistic knowledge increases. Our interim goal should be to develop the model to an extent that makes it possible to apply it to various technical and economic problems.

8.2 Linguistic Semantics and FNL

In this section, we will outline some aspects of linguistic semantics and relate them to the existing results in FNL. Let us remark that the model of semantics of FNL stems from the ideas presented in the book [33] on the Alternative mathematical model of linguistic semantics (AML).

We will turn our attention to a selection of specific linguistic units and phenomena, namely nouns, adjectives, adverbs, hedging, simple noun phrases and some others. Our goal is to recall the wealth and complexity of natural language and to outline some of the problems and results of linguistic studies. In each case we also outline how FNL, in its present state, copes with some of the linguistic phenomena discussed.

We will use the means of fuzzy type theory. Recall that FTT assigns a type to each formula. To make the notation more readable and transparent, we often introduce the type of a given formula only in the first occurrence and then write it without the type assuming that the reader still keeps it in mind.

8.2.1 Nouns and Objects

Nouns are names of objects. More specifically, they denote persons, places, things, events, substances, qualities, quantities, etc. Nouns are either original, e.g., “house, horse, square”, or derived, namely from adjectives, e.g., “redness, beauty, simplicity”, or from verbs (postverbal nouns), e.g., “jumping, walk, writing”. There are also proper nouns, which denote one specific object, e.g., “Earth, Saturn, Russia”, and common nouns, which denote a class of objects, e.g., “planet, mammal, street”.

Countable nouns are common nouns that can take a plural, can be combined with numerals or counting quantifiers (e.g., one, two, several, every, most), and can take an indefinite article such as “a” or “an” (in languages which have such articles). Examples of count nouns are “chair, nose, occasion”. Mass nouns, or uncountable (non-count) nouns, differ from count nouns in precisely that respect: they cannot take plurals or combine with number words or the above type of quantifiers. For example, it is not possible to refer to “a furniture” or “three furnitures”. Depending on the kind of objects, we can also distinguish concrete nouns (e.g., “horse, table, house”) and abstract nouns (e.g., “work, idea, feeling, happiness”).

Objects are entities that can have very complicated properties. In general, an object is a phenomenon to which we concede an individuality that makes it distinct from its surroundings. In fact, we can construe an arbitrary phenomenon as an object.

So far, there is no more detailed model of nouns in FNL (or in fuzzy set theory in general). The problem lies in finding a satisfactory model of real objects, because their complexity is too high. It is possible to model some very special objects, such as 2-D or 3-D geometrical shapes. In full generality, however, which also includes abstract objects such as those denoted by postverbal nouns, this task is so far too complicated.

In FNL, we suggest a simplification that can work in various AI applications and elsewhere. Namely, note that each object can be characterized by various kinds of features (characteristics), for example, “height, nationality, age, weight, strength, shape, intelligence”, etc. Therefore, we can identify objects with sets of values of features. Semantically, an object can be represented by a tuple

$$\begin{aligned} {\mathbf {o}} = \langle v_1, v_2, \ldots , v_n\rangle \end{aligned}$$
(8.4)

where \(i=1, \ldots , n\) are various features characterizing the object and \(v_i\) are their values. In principle, \(n\) can even be infinite. For example, if \(i\) is the feature “length” then we may have \(v_i=1.2\) m. We can see that the values can be real numbers or some other kinds of characteristics. Since the set of real numbers is rich enough to represent all kinds of values, we will in practice usually take \(v_i\in \mathbb {R}\).

A fuzzy set of objects \({\mathbf {o}}\) in (8.4) forms an extension of a noun if it is determined with respect to a certain context (possible world). The intension of a noun is then represented by a function from the set of all contexts into the set of all fuzzy sets of objects.

The objects and nouns can be expressed in the syntax of FTT as follows. First, we must introduce special types:

  (a) \(\varphi \): features of objects. Any formula \(A_{\varphi }\) represents some specific feature. Given a model, the interpretation \({\mathcal {M}}(A_{\varphi })\in M_{\varphi }\) is a unique object representing one specific feature. In (8.4) it is a given \(i\in \{1, \ldots , n \}\).

  (b) \(\alpha \): values of concrete features. As mentioned above, in the model we will usually take \(M_{\alpha }=\mathbb {R}\).

  (c) \(\omega \): local context of the values of one specific feature. A given feature may attain various values depending on its local context, and so we introduce this special type for it. Note that this type may itself be more complex; for example, we may put \(\omega \mathrel {:=}\,\alpha o\).

  (d) \(\omega \varphi \): global context, which covers all features and can be taken as a context of the whole noun. We will use the variable \({\mathbf {w}}_{\omega \varphi }\) for global contexts. In a model \({\mathcal {M}}\), the interpretation \({\mathcal {M}}({\mathbf {w}}_{\omega \varphi })\in M_{\omega \varphi }\) is a function that assigns to each feature from \(M_{\varphi }\) a context \(w\in M_{\omega }\).

  (e) \((\alpha \omega )\varphi \): type of objects that are elements of an extension of a noun. The objects are represented by sets of values of features in a global context. We will use the variable \(\mathbf {h}_{(\alpha \omega )\varphi }\) for objects. In a model \({\mathcal {M}}\), the interpretation \({\mathcal {M}}(\mathbf {h}_{(\alpha \omega )\varphi })\in M_{(\alpha \omega )\varphi }\) is a tuple of the form (8.4). For example, \({\mathcal {M}}(\mathbf {h}_{(\alpha \omega )\varphi })\) can be a Swede.

  (f) \(((o \alpha ) \omega )\varphi \): type of a noun. Any formula \(\mathbf {S}_{((o \alpha ) \omega )\varphi }\) represents a formal way in which a noun is construed. Its interpretation in a model \({\mathcal {M}}\) is a function which assigns to each feature from \(M_{\varphi }\) its intension.

Semantics of nouns

In accordance with the results of analysis done in linguistics and logic, semantics of expressions of natural language is characterized by the concepts of possible world, intension and extension. Then, we can formalize semantics of nouns in FNL as follows.

Let \(\mathbf {S}_{((o \alpha ) \omega )\varphi }\) be a formula representing a Noun. For example, Noun can be Swede, plate, house, etc. Intension of Noun is defined by

$$\begin{aligned} \mathop {\mathrm{Int}}\nolimits (\mathbf{Noun})\mathrel {:=}\,\lambda {\mathbf {w}}_{\omega \varphi }\, \lambda \mathbf {h}_{(\alpha \omega )\varphi }\,\cdot (\forall c_{\varphi }) (\mathbf {S} c\, ({\mathbf {w}}c)(\mathbf {h}c({\mathbf {w}}c))). \end{aligned}$$
(8.5)

Thus, in a model \({\mathcal {M}}\), intension (8.5) of Noun is interpreted by a function assigning to each global context \({\mathcal {M}}({\mathbf {w}})\) a fuzzy set of objects \({\mathcal {M}}(\mathbf {h})\). It can be seen from (8.5) that each feature \({\mathcal {M}}(c_{\varphi })\) is in a local context \({\mathcal {M}}({\mathbf {w}}c)\) assigned a value \({\mathcal {M}}(\mathbf {h}c({\mathbf {w}}c))\) which also depends on the context \({\mathcal {M}}({\mathbf {w}}c)\).

It follows from (8.5) that extension of Noun in a (global) context \({\mathbf {w}}\) is

$$\begin{aligned} \mathop {\mathrm{Ext}}\nolimits _{{\mathbf {w}}}(\mathbf{Noun})\mathrel {:=}\,\lambda \mathbf {h}_{(\alpha \omega )\varphi }\, \cdot (\forall c_{\varphi })(\mathbf {S} c ({\mathbf {w}}c)(\mathbf {h}c({\mathbf {w}}c))). \end{aligned}$$
(8.6)

Clearly, in a model \({\mathcal {M}}\) it is a fuzzy set of elements \({\mathcal {M}}(\mathbf {h}_{(\alpha \omega )\varphi })\).

In this model, we easily obtain the semantics of “a Noun” and “the Noun” (Footnote 7), for example, “a Swede” and “the Swede”. In the former case, for a given context, the interpretation is an element from the kernel of the fuzzy set (8.6), and in the latter case it is one specific object taken from its support (see below).
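The way (8.6) assigns a membership degree to an object can be illustrated on a finite toy model. In the sketch below, a noun maps each feature to a function from a local context to a fuzzy set of values, a global context maps each feature to its local context, and an object maps each feature to a context-dependent value. The quantifier \(\forall c_{\varphi }\) is interpreted as the minimum over the (finitely many) features. The kernel and the support are taken in the usual sense, as the elements with degree 1 and with nonzero degree, respectively. The feature names, the linear shape of the fuzzy sets, the numbers and all identifiers are assumptions made only for this illustration.

```python
def ramp(local_context):
    """Fuzzy set of values rising linearly across the local context (lo, hi)."""
    lo, hi = local_context
    return lambda v: min(1.0, max(0.0, (v - lo) / (hi - lo)))


def constant(value):
    """Feature value of an object that does not change with the local context."""
    return lambda local_context: value


# Hypothetical noun S: feature -> (local context -> fuzzy set of values).
NOUN = {"height": ramp, "floors": ramp}

# Hypothetical global context w: feature -> local context.
CONTEXT = {"height": (50.0, 150.0), "floors": (10.0, 40.0)}

# Hypothetical objects h: feature -> (local context -> value).
OBJECTS = {
    "tower_a": {"height": constant(200.0), "floors": constant(50.0)},
    "tower_b": {"height": constant(100.0), "floors": constant(40.0)},
    "shed": {"height": constant(3.0), "floors": constant(1.0)},
}


def extension_degree(noun, context, obj):
    """Degree of obj in the extension: min over features c of S c (w c)(h c (w c))."""
    return min(noun[c](context[c])(obj[c](context[c])) for c in noun)


def extension(noun, context, objects):
    """Fuzzy set of objects, given as a mapping from object name to degree."""
    return {name: extension_degree(noun, context, obj)
            for name, obj in objects.items()}


def kernel(fuzzy_set):
    return {x for x, degree in fuzzy_set.items() if degree == 1.0}


def support(fuzzy_set):
    return {x for x, degree in fuzzy_set.items() if degree > 0.0}


if __name__ == "__main__":
    ext = extension(NOUN, CONTEXT, OBJECTS)
    print(ext, kernel(ext), support(ext))
```

Changing `CONTEXT` changes the resulting fuzzy set while `NOUN` stays fixed, which mirrors the distinction between the intension (8.5) and the extension (8.6).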

8.2.2 Adjectives

Adjectives are names of properties of objects. We can distinguish proper and derived adjectives. Examples of the former are “red, tall, good”; the latter are derived from other types of words, namely nouns (e.g., wood–wooden, grease–greasy) or verbs (deverbal adjectives), for example “smiling, washing”.

We can also distinguish other specific classes of adjectives. Important are (see [3]) gradable adjectives [2] (also called degree adjectives), for example “hot, small, tall”, evaluative adjectives [59], for example “good, awful, fantastic, disastrous”, and absolute (non-gradable) adjectives, for example “green, freezing, dead, nuclear”. The gradable adjectives can be further divided (see [19]) into absolute gradable adjectives, for example “bent, straight”, and relative gradable adjectives, such as “expensive, tall, strong”.

The accepted hypothesis is that gradable adjectives denote functions that map objects onto representations of the degree to which they possess some gradable property. Hence, gradable adjectives may differ with respect to their scales. This corresponds with our model of objects in FNL outlined above.

Another specific feature of gradable, and also of evaluative, adjectives [56] is the existence of pairs of antonyms, for example “short–long, clean–dirty, complete–incomplete”. Antonymous pairs of gradable adjectives can be complementary or non-complementary. Complementary adjectives are pairs of antonymous adjectives that are, furthermore, each other’s negation on their domain, e.g. complete–incomplete. Non-complementary adjectives may have an extension gap, which corresponds to the set of objects of which the predicate is neither true nor false in a particular context of utterance. This gives rise to what is called in FNL the fundamental evaluative trichotomy: a triple of expressions consisting of the nominal adjective, its antonym and a middle member, for example “weak–medium (strong)–strong”, etc. (see below). Crucially, the positive and negative extensions and the extension gap of a gradable predicate may vary across contexts of use, becoming more or less precise. In FNL, we propose a model covering both gradable and evaluative adjectives, including also hedging, which is discussed in the next subsection.

A special phenomenon is the existence of the comparative and superlative of adjectives. The comparative is a name of a relation that does not necessarily correspond with the given adjective. For example, “bigger” is a name of a relation \(\ge \) that does not correspond with the adjective “big”. Clearly, if “John is small” and “Charles is bigger”, it does not follow that “Charles is big” (both John and Charles can be very small men). The comparative has degrees, since we can say, for example, “much bigger, a little smaller”, etc. The superlative is derived from the comparative and, in fact, corresponds to the result of maximization (in the mathematical sense). No satisfactory model of the comparative and superlative phenomena has been suggested in FNL so far.
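The John and Charles example can be made concrete with a small sketch. It is not a model of the comparative in FNL (no such model is claimed here); it only shows that a crisp ordering relation on heights is independent of the context-dependent fuzzy sets “small” and “big”. The heights, the context boundaries and the linear membership shapes are assumptions chosen for illustration.

```python
def clamp01(x):
    return min(1.0, max(0.0, x))


# Hypothetical context: heights of adult men in cm (low end, middle, high end).
LOW, MIDDLE, HIGH = 150.0, 175.0, 200.0


def small(height):
    """Assumed membership of 'small': 1 at LOW, falling to 0 at MIDDLE."""
    return clamp01((MIDDLE - height) / (MIDDLE - LOW))


def big(height):
    """Assumed membership of 'big': 0 at MIDDLE, rising to 1 at HIGH."""
    return clamp01((height - MIDDLE) / (HIGH - MIDDLE))


def bigger(x, y):
    """The comparative as the ordering relation on heights, as in the text."""
    return x >= y


JOHN, CHARLES = 155.0, 160.0   # hypothetical heights in cm

if __name__ == "__main__":
    print("John is small:", small(JOHN))
    print("Charles is bigger than John:", bigger(CHARLES, JOHN))
    print("Charles is big:", big(CHARLES))
```

Here Charles is bigger than John although his degree of being big is zero, so the relation named by the comparative cannot be derived from the fuzzy set of the positive form alone.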

8.2.3 Hedging

Hedging is a linguistic phenomenon that is used to specify more or less closely the topic of utterance. In the theory of fuzzy sets, we learned about hedges being special adverbs (such as “very, roughly”). However, hedging is a wider phenomenon that can be expressed by more complex expressions.

Examples of hedging are expressions such as “to some respect, in a sense, a sort of” but also “very roughly, approximately, more or less, about, almost”. The most important class in hedging is that of intensifying adverbs, among which we rank “very, extremely, typically, roughly”, etc.

The concept of hedging was first analyzed in more detail in linguistics by G. Lakoff [22]. He also noticed that the general effect of hedging is either to increase fuzziness (widening effect) or to decrease fuzziness (narrowing effect). Thus, hedging is an important tool of natural language that enables us to specify more concretely what we have in mind. We may distinguish narrowing hedges (very, extremely, significantly, etc.), widening hedges (more or less, roughly, very roughly, etc.) and specifying hedges (approximately, about, rather, precisely, etc.).
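The narrowing and widening effects can be illustrated with the classical power-type modifiers known from elementary fuzzy set theory, where “very” squares a membership degree and “more or less” takes its square root. This is only a sketch under that assumption; it is not the model of hedges used in FNL, and specifying hedges are not covered. The membership function `small` and its scale are hypothetical.

```python
import math


def small(x, lo=0.0, hi=10.0):
    """Hypothetical membership of 'small' on [lo, hi]: 1 at lo, falling to 0 at hi."""
    return min(1.0, max(0.0, (hi - x) / (hi - lo)))


def very(mu):
    """Narrowing hedge (assumed form): squaring lowers every intermediate degree."""
    return lambda x: mu(x) ** 2


def more_or_less(mu):
    """Widening hedge (assumed form): square root raises every intermediate degree."""
    return lambda x: math.sqrt(mu(x))


def is_narrower(mu_a, mu_b, points):
    """True if mu_a never exceeds mu_b on the sampled points."""
    return all(mu_a(x) <= mu_b(x) for x in points)


if __name__ == "__main__":
    x = 7.5
    print(very(small)(x), small(x), more_or_less(small)(x))
```

On the sampled scale the three fuzzy sets are nested: “very small” lies inside “small”, which lies inside “more or less small”.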

8.2.4 Evaluative Linguistic Expressions in FNL

The analysis of nouns, adjectives and hedges gives rise to a more general concept. Namely, we can introduce a special class of linguistic expressions called evaluative ones. When putting together properties of gradable and evaluative adjectives discussed in the literature, we can specify evaluative linguistic expressions as special expressions of natural language using which people evaluate phenomena around them.

They include the following classes of linguistic expressions:

  1. (i)

    \(\langle \text {TE-adjective}\rangle \), that belong to a class of special adjectives (TE stands for “trichotomic evaluative”) that include gradable adjectives (big, cold, deep, fast, friendly, happy, high, hot, important, long, popular, rich, strong, tall, warm, weak, young), evaluative adjectives (good, bad, clever, stupid, ugly, etc.), but also adjectives such as left, middle, medium, etc. The TE-adjectives can usually be grouped to form a fundamental evaluative trichotomy that consists of two antonyms and a middle member, for example low, medium, high; clever, average, stupid; good, normal, bad, etc. The triple of adjectives small, medium, big will further be taken as canonical. An exception are complementary adjectives mentioned above that lack the middle member.

  2. (ii)

    Fuzzy numbers. These include all linguistic expressions containing some number that is often completed by some hedge, for example “three hundred, roughly one hundred, about twenty five, approximately two million”, etc.

  3. (iii)

    Simple evaluative linguistic expressions (possibly with signs). They have a general form

    $$\begin{aligned} \langle \text {linguistic hedge}\rangle \langle \text {TE-adjective}\rangle . \end{aligned}$$
    (8.7)

    From the logical point of view, it is reasonable to introduce also empty hedge. Then we can consider as simple evaluative expressions also pure adjectives. Hence, examples of simple evaluative expressions are “small, rather medium, very big, more or less weak, medium strong, strong, quite silly, normal, extremely intelligent”, etc. Note that from grammatical point of view simple evaluative expressions are adjective phrases. Note also, that we cannot apply the same hedge with all adjectives. For example “very medium” has no meaning and so, it is not an evaluative expression.

  4. (iv)

    Compound evaluative expressions (roughly small or medium, small but not very (small), etc.). These expressions are formed from simple ones using connectives. However, these expressions never form a boolean structure since there are many combinations that have no sense. For example, the expression “very small or medium and extremely big” has no meaning.

  5. (v)

    Negative evaluative expressions (not small, not very big, etc.). The use of negation is problematic and one encounters here a special linguistic phenomenon called topic-focus articulation (Cf. [14]). Namely, the particle “not” can act in at least two ways: either on the whole evaluative expression or only on the hedge. For example, “not very small” has (at least) two different meanings: either “(not very) small”, where “very” is negated so that we deal with a new hedge “not very”, or “not (very small)”, where the whole expression “very small” is negated.

In the applications of fuzzy logic, evaluative linguistic expressions occur in the expressions of the form

$$\begin{aligned} X \text { is } \langle \text {evaluative expression}\rangle \end{aligned}$$
(8.8)

where \(X\) is a variable whose values are the values of some measurable feature of the noun and \(\mathcal {A}\) denotes the evaluative expression. Such expressions are a simplified form of a special class of verb phrases called evaluative linguistic predications. From the linguistic point of view, they are simple phrases of the form

$$\begin{aligned} \langle \text {noun}\rangle \text { is } \langle \text {evaluative expression}\rangle \end{aligned}$$
(8.9)

where “is” is a copula (the verb “to be”). Examples are “temperature is low”, “very intelligent man”, “more or less weak force”, “medium tension”, “extremely long bridge”, “short distance and pleasant walk”, “roughly small or medium speed”, etc.

Evaluative predications semantically express a property of object(s) characterized by the given evaluative expression. If noun is concretely specified (e.g., John, my friend) then the meaning of (8.9) is a truth value. If it is general, e.g., “house is big” (i.e., without specification, which house) then the meaning of (8.9) (and also of (8.8)) is extension of all objects having the property denoted by \(\langle \text {evaluative expression}\rangle \). In this case, the meaning of (8.9) is equal to the meaning of

$$ \langle \text {evaluative expression}\rangle \langle \text {noun}\rangle , $$

for example “big house”.

In more general way, evaluative linguistic expressions occur in the position of adjective phrases characterizing features of some objects and are used either in predicative or attributive role.

Example of the predicative use is “The man is very stupid”. Example of the attributive one is “The very stupid man climbed a tree”. The purpose of the first sentence is simply to communicate a particular quality of the sentence’s subject. The purpose of the second sentence is primarily to tell us what the subject did i.e. climbed a tree; that the subject is very stupid is a secondary consideration.

Semantics of evaluative expressions in FNL

Formalization of the semantics of evaluative expressions in FNL is based on the standard assumptions of the theory of semantics developed in linguistics and logic (Cf. [14, 27, 28]). Namely, the fundamental concepts to be formalized are possible world, intension, and extension. In FNL, this task is solved by introducing a special formal theory \(T^{\text {Ev}}\) of Łukasiewicz fuzzy type theory (Ł-FTT). This theory (see footnote 8) formalizes certain general characteristics of the semantics of evaluative expressions.

Let us remark that the model of semantics of evaluative expressions is very successful in applications. One of the reasons is that this semantics is based on the theory of ordered sets that is a well elaborated part of mathematics and it is relatively easy to construct the necessary models.

One of the essential concepts in the theory of semantics of natural language expressions is that of a possible world. This concept can be traced back to Leibniz and, in its modern conception, to Carnap as well as to logicians such as Quine, Wittgenstein, Lewis, Kripke and many others.

In the theory of evaluative expressions, we will speak about context instead of a possible world. The latter is usually taken as a state of the world at a given point in time and space. It is very difficult to formalize such a definition. However, in [39], it is argued that extensions of evaluative expressions are classes of elements taken from some scale representing. Therefore, we can introduce a simplified concept of context that is a nonempty, linearly ordered and bounded set, in which three distinguished limit points can be determined: a left bound \(v_L\), a right bound \(v_R\), and a central point \(v_S\). Hence, each context is identified with an ordered triple

$$ w = \langle v_L, v_S, v_R\rangle $$

where \(v_L, v_S, v_R\in U\). A straightforward example is the predication “\(\mathcal {A}\) town”, for example “small town”, “very big town”, etc. Then, the corresponding context for the Czech Republic can be \(\langle 3~000, 50~000, 1~000~000\rangle \), while for the USA it can be \(\langle 30~000, 200~000, 10~000~000\rangle \).

We introduce a set \(W\) of contexts. Each element \(w\in W\) gives rise to an interval \(w= [v_L, v_R]\subset U\).

Intension \(\mathop {\mathrm{Int}}\nolimits (\mathcal {A})\) of an evaluative expression \(\mathcal {A}\) is a property that attains various truth values in various contexts but is invariant with respect to them. Therefore, it is modeled as a function

$$\begin{aligned} \mathop {\mathrm{Int}}\nolimits (\mathcal {A}): W\rightarrow \mathcal {F}(U) \end{aligned}$$
(8.10)

where \(\mathcal {F}(U)\) is the set of all fuzzy sets over \(U\). Note that the intension of an evaluative expression or predication \(\mathcal {A}\) is obtained as the interpretation of a formula \(\lambda w\,\lambda x\,(A w) x\) (in the language of Ł-FTT) in a special model \({\mathcal {M}}\).

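Extension \(\mathop {\mathrm{Ext}}\nolimits _w(\mathcal {A})\) of an evaluative expression \(\mathcal {A}\) in the context \(w\in W\) is a fuzzy set of elements, namely the value of the intension (8.10) at \(w\):

$$ \mathop {\mathrm{Ext}}\nolimits _w(\mathcal {A}) = \mathop {\mathrm{Int}}\nolimits (\mathcal {A})(w) \in \mathcal {F}(U). $$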

In our example, the truth value of a “small town having 30 000 inhabitants” could be, for example, 0.7 in the Czech Republic and 1 in the USA.

In the theory \(T^{\text {Ev}}\), the extension of an evaluative expression is obtained as a shifted horizon where the shift corresponds to a linguistic hedge, which is thus modeled by a function \(L\rightarrow L\). A graphical scheme of such an interpretation in a specific context can be seen in Fig. 8.1.

Fig. 8.1 Graphical scheme of a construction of extensions of evaluative expressions in a given context. Each extension is obtained as a composition of a function representing a respective horizon \(\mathop { LH}\nolimits , \mathop { MH}\nolimits , \mathop { RH}\nolimits \) (in the figure, it is linear because of the use of Ł-FTT) and the deformation function \(\nu _{a,b,c}\) whose graph is for convenience depicted turned 90\(^{\circ }\) anticlockwise.
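The dependence of an extension on the context can be illustrated by a minimal Python sketch. It is not the model of \(T^{\text {Ev}}\): the linear shape of the left horizon, the hedge function `very`, and all names (`Context`, `left_horizon`, `extension_small`) are assumptions made only for illustration, so the degrees it produces need not match the illustrative truth values quoted in the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Context:
    """A context w = <v_L, v_S, v_R> on a linearly ordered scale."""
    v_left: float
    v_mid: float
    v_right: float


def left_horizon(w: Context, x: float) -> float:
    """Assumed linear horizon LH: degree 1 at v_L, falling to 0 at v_S."""
    if x <= w.v_left:
        return 1.0
    if x >= w.v_mid:
        return 0.0
    return (w.v_mid - x) / (w.v_mid - w.v_left)


def very(t: float) -> float:
    """Hypothetical hedge L -> L that narrows a horizon (a stand-in, not nu_{a,b,c})."""
    return max(0.0, 2.0 * t - 1.0)


def extension_small(w: Context, x: float,
                    hedge: Callable[[float], float] = lambda t: t) -> float:
    """Degree to which x is '<hedge> small' in context w (empty hedge by default)."""
    return hedge(left_horizon(w, x))


# The two contexts for "town" taken from the text.
czech = Context(3_000, 50_000, 1_000_000)
usa = Context(30_000, 200_000, 10_000_000)

town = 30_000
degree_cz = extension_small(czech, town)             # "small town" in the Czech context
degree_us = extension_small(usa, town)               # "small town" in the US context
degree_cz_very = extension_small(czech, town, very)  # "very small town", Czech context
print(degree_cz, degree_us, degree_cz_very)
```

The same number of inhabitants thus receives different degrees in different contexts, and a narrowing hedge lowers the degree further; only this qualitative behavior, not the particular numbers, is meant to reflect the theory.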

Recall also our discussion above about gradable adjectives. We can distinguish absolute and relative ones. What is the difference between them? From our point of view, the solution is simple: absolute gradable adjectives have only one context and so their intension coincides with their extension, while the relative ones have (infinitely) many contexts and so their intension is the function (8.10).

8.2.5 Linguistic Quantifiers and Determiners

A special phenomenon in natural language is a huge and apparently messy collection of expressions such as “not just every and some, but most, few, between five and ten, a lot of”, and many others. These expressions occur in typical sentences, for instance:

  1. (a)

    Few (both, enough, at least ten, all but five) students attended the party.

  2. (b)

    More male than female students attended the party.

  3. (c)

    John’s mother arrived.

  4. (d)

    Every student attended the party.

These sentences have the following standard syntactic structure:

figure a

The Det is a determiner that is a generalized (linguistic) quantifier. The NP is a noun phrase. In our case, it is a quantified noun phrase, also called a determiner phrase.

In linguistic semantics, a generalized quantifier is an expression that denotes a property of a property (a higher-order property).

In linguistics, a determiner phrase (DP) is a type of phrase posited by some theories of syntax. The head of a DP is a determiner, as opposed to a noun. For example in the phrase “the car”, “the” is a determiner and “car” is a noun; the two combine to form a phrase, and on the DP-analysis, the determiner “the” is head over the noun “car”.

8.2.6 Intermediate (Fuzzy) Quantifiers in FNL

In logic, the linguistic quantifiers are modeled using the concept of generalized quantifier [18, 24, 31, 54, 61]. These were in fuzzy set theory generalized under the name fuzzy (generalized) quantifiers. The first paper in this topic was written by L.A. Zadeh [68]. His theory has been further elaborated by several authors (See, e.g., [9, 12, 15, 17]).

The theory in the mentioned papers is focused on computation rather than on linguistics. A proposal that takes the linguistic side into account, and thus contributes to the theory of FNL, was introduced by Novák [40]. This theory was inspired by the theory of intermediate quantifiers studied in detail by Peterson [55]. These are linguistic quantifiers whose meaning lies between the classical general (universal) and existential quantifiers. The basic idea consists in the assumption that these quantifiers are classical general or existential quantifiers for which the universe of quantification is modified and the modification can be imprecise.

We introduce a theory \(T^{\text {IQ}}\) which is a special theory of Ł-FTT extending the theory \(T^{\text {Ev}}\) of evaluative linguistic expressions introduced in the previous section by a few more axioms, namely those for the measure function (see below).

The theory \(T^{\text {IQ}}\) is obtained from \(T^{\text {Ev}}\) by extending the latter by the concept of measure of fuzzy sets. In the frame of Ł-FTT, it can be introduced syntactically. Namely, the measure is represented by a special formula \(\mu \in \textit{Form}_{o(o\alpha )(o\alpha )}\) whose interpretation is a function \(M_{\alpha }\rightarrow L\), i.e. values of the measure are taken from the set of truth values (see footnote 9). Moreover, the measure is normed with respect to some reference fuzzy set (recall that a formula of type \(o\alpha \) represents a function \(M_{\alpha }\rightarrow L\), i.e. a fuzzy set). Namely,

$$ \mu (z_{o\alpha }) x_{o\alpha } $$

represents a measure of a fuzzy set \(x_{o\alpha }\) normed with respect to the fuzzy set \(z_{o\alpha }\) (i.e., \(x_{o\alpha }\) is proportional to \(z_{o\alpha }\)). Its properties (and interpretation) can be found in the cited literature.

Definition 1

Let \(\mathop { Ev}\nolimits \in \textit{Form}_{oo}\) be intension of some evaluative expression, \(A,B\in \textit{Form}_{o\alpha }\) be formulas and \(z\in \textit{Form}_{o\alpha }\) and \(x\in \textit{Form}_{\alpha }\) variables where \(\alpha \in \mathcal {S}\). Then a type \(\langle 1, 1\rangle \) intermediate generalized quantifier interpreting the sentence

\(\langle \text {Quantifier}\rangle \) \(B\)’s are \(A\)

is one of the following formulas:

$$ \begin{aligned} (Q^{\forall }_{\mathop { Ev}\nolimits }\, x) (B, A)&\equiv (\exists z)((\pmb {\varDelta }(z\subseteq B)\mathop {\pmb { \& }}\nolimits (\forall x)(z\, x\,\pmb {\Rightarrow }\,A x))\nonumber \\&\qquad \qquad \pmb {\wedge }\mathop { Ev}\nolimits ((\mu B) z)).\end{aligned}$$
(8.11)
$$ \begin{aligned} (Q^{\exists }_{\mathop { Ev}\nolimits }\, x) (B, A)&\equiv (\exists z)((\pmb {\varDelta }(z\subseteq B)\mathop {\pmb { \& }}\nolimits (\exists x)(z\, x\,\pmb {\wedge }\,A x))\nonumber \\&\qquad \qquad \pmb {\wedge }\mathop { Ev}\nolimits ((\mu B) z)). \end{aligned}$$
(8.12)

For some syllogistic figures, we also need a presupposition requiring that only non-empty (fuzzy) subsets of \(B\) are considered.

Note that each formula above consists of three parts:

$$ \begin{aligned}&\underbrace{(\exists z)((\pmb {\varDelta }(z\subseteq B)}_{\text {``the greatest'' part}\,\text {of}\, B{\text {'s}}} \mathop {\pmb { \& }}\nolimits \\&\qquad \qquad \qquad \qquad \qquad \underbrace{(\forall x)(z\, x\,\pmb {\Rightarrow }\,A x))}_{\text { each of}\,B\text {'s has}\,A}\pmb {\wedge }\\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \underbrace{\mathop { Ev}\nolimits ((\mu B) z))}_{\text { size of}\,z\,\text {is evaluated by}\mathop { Ev}\nolimits } \end{aligned}$$

Thus, the concrete quantifiers are obtained when specifying the evaluative expression \(\mathop { Ev}\nolimits \).
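To make the three parts of the formula more tangible, the following Python sketch evaluates a quantifier of the shape (8.11) on a small finite universe. It rests on strong simplifying assumptions that are not part of the theory: \(B\) is a crisp non-empty finite set, \(z\) ranges only over crisp subsets of \(B\) (so \(\pmb {\varDelta }(z\subseteq B)\) is always 1), and the measure of \(z\) normed by \(B\) is taken to be \(|z|/|B|\). The functions `big_very` and `big_delta` are hypothetical stand-ins for the extensions of \(\mathop { Bi}\nolimits \textit{Ve}\) and \(\mathop { Bi}\nolimits \pmb {\varDelta }\).

```python
from itertools import combinations
from typing import Callable, Dict, Sequence


def big_very(r: float) -> float:
    """Hypothetical extension of 'very big' on relative sizes r in [0, 1]."""
    return max(0.0, min(1.0, 2.0 * r - 1.0))


def big_delta(r: float) -> float:
    """Crisp stand-in for 'Bi Delta': only the whole of B counts (used for 'all')."""
    return 1.0 if r == 1.0 else 0.0


def intermediate_forall(B: Sequence[str], A: Dict[str, float],
                        ev: Callable[[float], float]) -> float:
    """Degree of '<Quantifier> B are A', following the shape of (8.11).

    Assumptions: B is crisp, finite and non-empty, z ranges over crisp
    subsets of B, and the measure of z normed by B is |z| / |B|.
    """
    best = 0.0
    for k in range(len(B) + 1):
        for z in combinations(B, k):
            # (forall x)(z x => A x): for crisp z, the least A-degree within z
            all_z_are_a = min((A[x] for x in z), default=1.0)
            # Ev((mu B) z): the relative size of z evaluated by the expression
            size = len(z) / len(B)
            best = max(best, min(all_z_are_a, ev(size)))
    return best


swedes = ["s1", "s2", "s3", "s4"]
tall = {"s1": 1.0, "s2": 1.0, "s3": 0.5, "s4": 0.0}

most_are_tall = intermediate_forall(swedes, tall, big_very)
all_are_tall = intermediate_forall(swedes, tall, big_delta)
print(most_are_tall, all_are_tall)
```

Since crisp subsets are only some of the fuzzy sets \(z\) admitted by (8.11), the value computed here is in general only a lower estimate of the truth degree of the formula.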

Below are introduced several specific intermediate quantifiers based on the analysis provided by Peterson (Cf. [55]).

$$ \begin{aligned}&\mathbf{A}\mathbf{:}\text { All }B\text { are }A\mathrel {:=}\,Q^{\forall }_{\mathop { Bi}\nolimits \!\pmb {\varDelta }}(B, A)\equiv (\forall x)(Bx\,\pmb {\Rightarrow }\,Ax),\\&\mathbf{E}\mathbf{:}\text { No }B\text { are }A\mathrel {:=}\,Q^{\forall }_{\mathop { Bi}\nolimits \!\pmb {\varDelta }}(B, \pmb {\lnot }A)\equiv (\forall x)(Bx\,\pmb {\Rightarrow }\,\pmb {\lnot }Ax),\\&\mathbf{P}\mathbf{:}\text { Almost all }B\text { are }A \mathrel {:=}\,Q^{\forall }_{\mathop { Bi}\nolimits \textit{Ex}}(B, A)\equiv \\&(\exists z)((\pmb {\varDelta }(z\subseteq B)\mathop {\pmb { \& }}\nolimits (\forall x)(zx\,\pmb {\Rightarrow }\,Ax))\,\pmb {\wedge }\,(\mathop { Bi}\nolimits \textit{Ex})((\mu B) z)),\\&\qquad \qquad \qquad {({\textit{extremely big part of B has A}})}\\&\mathbf{B}\mathbf{:}\text { Few }B\text { are }A\; \mathrel {:=}\,Q^{\forall }_{\mathop { Bi}\nolimits \textit{Ex}}(B, \pmb {\lnot }A)\equiv \\&(\exists z)((\pmb {\varDelta }(z\subseteq B) \mathop {\pmb { \& }}\nolimits (\forall x)(zx\,\pmb {\Rightarrow }\,\pmb {\lnot }Ax))\,\pmb {\wedge }\,(\mathop { Bi}\nolimits \textit{Ex})((\mu B) z)) ,\\&\qquad \qquad \qquad {(\textit{extremely big part of B does not have A})}\\&\mathbf{T}\mathbf{:} \text { Most }B\text { are }A \mathrel {:=}\,Q^{\forall }_{\mathop { Bi}\nolimits \textit{Ve}}(B, A)\equiv \\&(\exists z)((\pmb {\varDelta }(z\subseteq B)\mathop {\pmb { \& }}\nolimits (\forall x)(zx\,\pmb {\Rightarrow }\,Ax))\,\pmb {\wedge }\,(\mathop { Bi}\nolimits \textit{Ve})((\mu B) z)) ,\\&\qquad \qquad \qquad \qquad \quad {(\textit{very big part of B has A})}\\&\mathbf{D}\mathbf{:} \text { Most }B\text { are not }A\mathrel {:=}\,Q^{\forall }_{\mathop { Bi}\nolimits \textit{Ve}}(B, \pmb {\lnot }A)\equiv \\&(\exists z)((\pmb {\varDelta }(z\subseteq B)\mathop {\pmb { \& }}\nolimits (\forall x)(zx\,\pmb {\Rightarrow }\,\pmb {\lnot }Ax))\,\pmb {\wedge }\,(\mathop { Bi}\nolimits \textit{Ve})((\mu B) z)) ,\\&\qquad \qquad \qquad \qquad {(\textit{very big part 
of B does not have A})}\\&\mathbf{K}\mathbf{:} \text { Many }B\text { are }A\mathrel {:=}\,Q^{\forall }_{\pmb {\lnot }(\mathop { Sm}\nolimits \bar{\mathop {\pmb {\nu }}\nolimits })}(B, A)\equiv \\&(\exists z)((\pmb {\varDelta }(z\subseteq B)\mathop {\pmb { \& }}\nolimits (\forall x)(zx\,\pmb {\Rightarrow }\,Ax))\,\pmb {\wedge }\pmb {\lnot }(\mathop { Sm}\nolimits \bar{\mathop {\pmb {\nu }}\nolimits })((\mu B) z)) ,\\&\qquad \qquad \qquad \qquad {(\textit{not small part of B has A})}\\&\mathbf{G}\mathbf{:} \text { Many }B\text { are not }A\mathrel {:=}\,Q^{\forall }_{\pmb {\lnot }(\mathop { Sm}\nolimits \bar{\mathop {\pmb {\nu }}\nolimits })}(B, \pmb {\lnot }A) \equiv \\&(\exists z)((\pmb {\varDelta }(z\subseteq B)\mathop {\pmb { \& }}\nolimits (\forall x)(zx\,\pmb {\Rightarrow }\,\pmb {\lnot }Ax))\,\pmb {\wedge }\pmb {\lnot }(\mathop { Sm}\nolimits \bar{\mathop {\pmb {\nu }}\nolimits })((\mu B) z)),\\&\qquad \qquad \qquad \quad {(\textit{not small part of B does not have A})}\\&\mathbf{I}\mathbf{:} \text { Some }B\text { are }A \mathrel {:=}\,Q^{\exists }_{\mathop { Bi}\nolimits \!\pmb {\varDelta }}(B, A)\equiv (\exists x)(Bx\,\pmb {\wedge }\,Ax),\\&\mathbf{O}\mathbf{:} \text { Some }B\text { are not }A \mathrel {:=}\,Q^{\exists }_{\mathop { Bi}\nolimits \!\pmb {\varDelta }}(B, \pmb {\lnot }A)\equiv (\exists x)(Bx\,\pmb {\wedge }\pmb {\lnot }Ax). \end{aligned}$$

Remark 3

  1. (i)

    The evaluative expressions used in the definition of the quantifiers above are considered in the abstract context \(w_{oo}\) and so, the variable \(w_{oo}\) is omitted in the corresponding formulas.

  2. (ii)

    The quantifier B is, in fact, defined as “Almost all \(B\) are not \(A\)”.

  3. (iii)

    The quantifier “most” is considered in its shifted meaning as “relatively close to all” and not as “simple majority”.

  4. (iv)

    The quantifiers with the hedge \(\pmb {\varDelta }\) are equivalent to the corresponding classical ones.

8.2.7 The Meaning of Noun Phrases and Simple Sentences

Using the formal means of FNL, we can construct the meaning of simple noun phrases or simple sentences. Our construction in this section will be syntactical. However, after defining a model, it is straightforward to construct the concrete fuzzy relations representing the meaning of the given noun phrase or a sentence.

We will first demonstrate our construction on a simple example of the noun phrase Very tall Swedes.

Special types

Let us first introduce the following special types:

  1. (a)

    \(\beta \): features (characteristics) of objects. These can be, for example, “height, nationality, age, weight”, and many others. In this section, we will consider only the first two.

  2. (b)

    \(\alpha \): values of concrete features. These are often real numbers, or some other kinds of characteristics.

  3. (c)

    \(\alpha \beta \): objects (people), i.e. people are in our model identified with sequences of values of features.

Special constants and variables

Recall that nouns are identified with functions \(f_{\alpha \beta }\) that can be seen as sets of values of features. Each feature \(c_{\beta }\) is assigned some value \(v_{\alpha }\) via the function \(f_{\alpha \beta }\). Interpretation of \(f_{\alpha \beta }\) in any model \({\mathcal {M}}\) can be seen as the set

$$\begin{aligned} {\mathcal {M}}(f_{\alpha \beta }) = \{\langle c, v\rangle , \langle c', v'\rangle , \ldots \} \end{aligned}$$
(8.13)

where \(c, c',\ldots \in M_{\beta }, \ldots \) are various features and \(v, v', \ldots \in M_{\alpha } \ldots \) are their values. The sets (8.13) represent people in this model. Furthermore:

  1. (i)

    The constant \(\mathbf {h}_{\beta }\) is the feature of height.

  2. (ii)

    The constant \(\mathbf {n}_{\beta }\) is the feature of nationality. Of course, this is a simplification. Instead, we could consider a set of several specific characteristics, such as mother tongue, hair, skin, etc. We do not need such a complicated model in this example.

  3. (iii)

    The variable \(\mathop {\pmb {\nu }}\nolimits _{oo}\) is a linguistic hedge. It is used as a general variable standing for specific hedges (very, roughly, etc.).

  4. (iv)

    The variable \(w_{\alpha o}\) is a context (possible world). Its interpretation is a function from the set of truth values into a set \(M_{\alpha }\) of type \(\alpha \). This trick makes it possible to transfer ordering properties of the algebra of truth values into the set \(M_{\alpha }\).

Every Swede is a human being and so, we must first introduce the property “to be human”:

$$\begin{aligned} \mathop {\mathrm{Int}}\nolimits (\mathbf{Human})\mathrel {:=}\,\lambda {\mathbf {w}}_{\omega \varphi }\, \lambda \mathbf {h}_{(\alpha \omega )\varphi }\,\cdot (\forall c_{\varphi }) (\mathbf {H} c ({\mathbf {w}}c)(\mathbf {h}c({\mathbf {w}}c))) \end{aligned}$$
(8.14)

where \(\mathbf {H}_{(o\alpha \omega )\varphi }\) is a formula representing people.

Among all features \(c_{\varphi }\) we may distinguish also a feature of nationality \(n_{\varphi }\) and height \(v_{\varphi }\). Let the nationality of Swedes \(n_{\varphi }\) be determined by the value (constant) \(\mathbf {d}_{\alpha }\). Then intension of “to be Swede” is

$$ \begin{aligned} \mathop {\mathrm{Int}}\nolimits (\text {Swedes})\mathrel {:=}\,\lambda {\mathbf {w}}_{\omega \varphi }\, \lambda \mathbf {h}_{(\alpha \omega )\varphi }\,\cdot (\forall c_{\varphi }) (\mathbf {H} c ({\mathbf {w}}c)(\mathbf {h}c({\mathbf {w}}c)))\mathop {\pmb { \& }}\nolimits (\mathbf {h}n({\mathbf {w}}n)\equiv \mathbf {d}_{\alpha }).\nonumber \\ \end{aligned}$$
(8.15)

Thus, intension of Swedes assigns to each context \({\mathbf {w}}_{\omega \varphi }\) a fuzzy set of people whose nationality \(\mathbf {h}n({\mathbf {w}}n)\) in the context \({\mathbf {w}}n\) corresponding to nationality has the value \(\mathbf {d}_{\alpha }\), i.e. the nationality of Swedes.

To express intension of Very tall Swedes, we must consider the height \(v_{\varphi }\) and characterize the truth of the proposition the height is very big in the context \({\mathbf {w}}v\):

$$ \begin{aligned} \mathop {\mathrm{Int}}\nolimits (\text {Very tall Swedes})&\mathrel {:=}\,\lambda {\mathbf {w}}_{\omega \varphi }\, \lambda \mathbf {h}_{(\alpha \omega )\varphi }\,\cdot (\forall c_{\varphi }) (\mathbf {H} c ({\mathbf {w}}c)(\mathbf {h}c({\mathbf {w}}c)))\mathop {\pmb { \& }}\nolimits \nonumber \\&\qquad (\mathbf {h}n({\mathbf {w}}n)\equiv \mathbf {d}_{\alpha })\mathop {\pmb { \& }}\nolimits (\mathop { Bi}\nolimits \textit{Ve})({\mathbf {w}}v)(\mathbf {h}v({\mathbf {w}}v)). \end{aligned}$$
(8.16)

Thus, this intension is a function assigning to each (global) context a fuzzy set of people whose nationality is to be Swede and whose height \(\mathbf {h}v({\mathbf {w}}v)\) in the context \({\mathbf {w}}v\) is very big (i.e., they are very tall).
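The shape of (8.16) can be mimicked by a small Python sketch in which people are dictionaries of feature values, as in (8.13). Everything concrete here is an assumption made for illustration: the sample people, the height contexts in centimetres, the linear right horizon used for “big”, and the hedge function `very`. The part of the formula expressing “to be human” is omitted because every entry of the sample data is a person.

```python
from typing import Dict, Tuple

HeightContext = Tuple[float, float, float]   # <v_L, v_S, v_R> for heights in cm


def right_horizon(w: HeightContext, x: float) -> float:
    """Assumed linear horizon for 'big': 0 up to v_S, rising to 1 at v_R."""
    _, v_mid, v_right = w
    if x <= v_mid:
        return 0.0
    if x >= v_right:
        return 1.0
    return (x - v_mid) / (v_right - v_mid)


def very(t: float) -> float:
    """Hypothetical narrowing hedge L -> L."""
    return max(0.0, 2.0 * t - 1.0)


def very_tall_swedes(people: Dict[str, dict], w: HeightContext) -> Dict[str, float]:
    """Fuzzy set of people assigned to the height context w, in the spirit of (8.16)."""
    result = {}
    for name, features in people.items():
        is_swede = 1.0 if features["nationality"] == "Swedish" else 0.0
        very_tall = very(right_horizon(w, features["height"]))
        # Lukasiewicz strong conjunction: a & b = max(0, a + b - 1)
        result[name] = max(0.0, is_swede + very_tall - 1.0)
    return result


# Hypothetical sample data.
people = {
    "Anna": {"height": 200.0, "nationality": "Swedish"},
    "Bo": {"height": 192.5, "nationality": "Swedish"},
    "Carlos": {"height": 200.0, "nationality": "Spanish"},
}

in_context_1 = very_tall_swedes(people, (140.0, 170.0, 200.0))
in_context_2 = very_tall_swedes(people, (150.0, 180.0, 210.0))
print(in_context_1)
print(in_context_2)
```

Calling the function with two different contexts shows the intension at work: the same people receive different membership degrees depending on the context in which height is evaluated.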

Now we can also construct the meaning of a simple sentence (see footnote 10):

$$ \langle \text {Quantifier}\rangle \,\,\textit{Swedes are tall}. $$

First we will introduce the formula Swede using \(\lambda \)-conversion: \(\text {Swede}\equiv \mathop {\mathrm{Int}}\nolimits (\text {Swedes})\) \({\mathbf {w}}\mathbf {h}\). Then intension of the proposition (sentence) “All Swedes are very tall” is

$$ \lambda {\mathbf {w}}_{\omega \varphi }\,\cdot (\forall \mathbf {h}_{(\alpha \omega )\varphi }) \text {Swede}\,{\mathbf {w}}\mathbf {h}\,\pmb {\Rightarrow }\,(\mathop { Bi}\nolimits \textit{Ve})({\mathbf {w}}v)(\mathbf {h}v({\mathbf {w}}v)). $$

Its interpretation in a model \({\mathcal {M}}\) is a function which assigns to each context (possible world \(w_{o\alpha }\)) a truth value. We thus obtain intensions of various kinds of propositions, such as “All Swedes are extremely tall”, “All Swedes are more or less tall”, etc.

Finally, we will analyze the proposition Most Swedes are tall. Using our theory of intermediate quantifiers, we obtain intension of this proposition as follows:

$$ \begin{aligned} \lambda \mathbf{{w}}_{\omega \varphi }\,&\cdot \,(\exists z_{o((\alpha \omega )\varphi )}(\pmb {\varDelta }(z\subseteq \mathbf{Swede}\,\, {\mathbf {w}})) \mathop {\pmb { \& }}\nolimits \nonumber \\&(\forall \mathbf{{h}}_{(\alpha \omega )\varphi }) (z\mathbf {h}\,\pmb {\Rightarrow }\, \mathop { Bi}\nolimits ({\mathbf {w}}v) (\mathbf {h}v({\mathbf {w}}v)))\mathop {\pmb { \& }}\nolimits \mathop { Bi}\nolimits \,\textit{Ve}((\mu (\mathbf{Swede}\,\,{\mathbf {w}})z)). \end{aligned}$$
(8.17)

Thus, interpretation of this formula in a model \({\mathcal {M}}\) is a function assigning to each (global) context \({\mathbf {w}}_{\omega \varphi }\) a truth value which is obtained as a minimum of the truth that the greatest fuzzy set \(z\) of tall Swedes in the context \({\mathbf {w}}\) is very big (in the sense of the measure \(\mu \)).

8.2.8 Verbs and Other Linguistic Phenomena

Verbs are the most complicated units of natural language. They vary by type, and each type is determined by the kinds of words that follow it and the relationship those words have with the verb itself. There are six types of verbs: intransitive (to run how, to speak how), two kinds of transitive (to read what, to consider what), to-be verbs, linking (seem, become) and two-place transitive (to give whom what).

Verbs stand at the core of sentences and can have a varying number of arguments (this is called valency) depending on the complexity of the sentence. Furthermore, they are characterized by other features, namely by tense (present, future, past), modality (necessity, indicative, possibility), aspect (perfective, imperfective, continuous, etc.), direction of speech, gender and others. More about verbs can be found in the extensive literature (see, e.g., [6, 30, 57]).

The model of the meaning of verbs must cope with the problem of changing valency. This mathematically means that verbs behave as relations with changing arity. A possible model of the meaning of verbs can interpret them, for example, as a union of fuzzy sets of fuzzy relations of different arities that depends on time. Hence, we may construct intension of a verb as a function

$$ \mathop {\mathrm{Int}}\nolimits (\mathsf{verb}): W\times T\rightarrow \bigcup _{n=1}^K \mathcal {F}(\mathcal {F}(U^n)) $$

where \(T\) is time (we can put \(T=\mathbb {R}\)), \(W\) is a set of possible worlds (contexts), \(U\) a universe and \(K\) is a possible valency of the verb. The universe should consist of various kinds of elements that can be named by noun phrases. In syntax of FTT, we can express the meaning of verbs as a formula

$$ \lambda w\,\lambda t\, \bigvee _{n=1}^K A_{o(o\alpha ^n)} $$

where \(\alpha ^n= \underbrace{\alpha \cdots \alpha }_{n-\text {times}}\) and \(\vee \) is a disjunction.

So far, no model of the meaning of verbs using the means of fuzzy logic has been suggested. A detailed and careful elaboration in cooperation with linguists is needed.

There are also other phenomena that need to be captured by the model of linguistic semantics. Among them let us recall the topic–focus articulation and de dicto/de re usage. The topic–focus articulation is a phenomenon that greatly extends the expressive power of natural language. Roughly speaking, each more complex linguistic expression can be divided into two parts: the topic, which is the known part, and the focus, which is the new information. Each expression is thus ambiguous and its meaning becomes clear only after both parts are specified. For example, “John goes to the cinema” says either that JOHN goes to the cinema (and not somebody else), or that John goes to the CINEMA (and not to the theater), etc. (see footnote 11).

The de dicto/de re distinction concerns whether we speak about an occupied office (e.g., about the president, de dicto) or about the person holding it (e.g., Mr. Obama, de re). This problem seems to be well captured by the means of FTT since it enables us to distinguish between functions \(A_{\epsilon \epsilon }\) and objects \(B_{\epsilon }\) and to manipulate them (Cf. [7]).

8.3 Reasoning in FNL

Since FNL claims to be a logic of human reasoning, it must also provide a model of such reasoning. So far, two main inference models are available. The first is inference on the basis of a linguistic description consisting of fuzzy/linguistic IF-THEN rules. The second consists of reasoning schemes with generalized quantifiers, of which the most elaborated part is intermediate syllogisms.

8.3.1 Fuzzy/Linguistic IF-THEN Rules

The theory of fuzzy IF-THEN rules is the most widely discussed and most powerful area of fuzzy logic, which has a wide variety of applications. Recall the general form of a fuzzy IF-THEN rule:

$$\begin{aligned} \mathsf{IF }\,\,X\text { is }\mathcal {A}\,\,\mathsf{ THEN }\,\,Y\text { is }\mathcal {B}, \end{aligned}$$
(8.18)

where \(X\text { is }\mathcal {A}, Y\text { is }\mathcal {B}\) are evaluative predications (see footnote 12). A typical example is

IF temperature is small  THEN the amount of gas is very big.

From the linguistic point of view, this is a simple conditional clause, i.e. a conditional sentence with a clear structure consisting of the antecedent and consequent.

In FNL, we call a (finite) set of rules (8.18) a linguistic description. The rules (8.18) (and the linguistic descriptions) apparently characterize some kind of relation between values of \(X\) and \(Y\). In fuzzy set theory, these rules are usually construed as special fuzzy relations. This is a purely extensional approach and the rules are, in fact, not treated as sentences of natural language. Let us remark that this method of interpretation of fuzzy IF-THEN rules is very convenient when we need a well-working tool for approximation of functions, but it is less convenient as a model of human reasoning. Therefore, it does not fit the paradigm of FNL.

In FNL, the rules (8.18) are taken as genuine conditional clauses of natural language and the linguistic description is taken as a text characterizing some situation, strategy of behavior, control of some process, etc. The goal of the constructed FNL model is to mimic the way people understand natural language. Then, a formal theory of Ł-FTT is considered so that the intension of each rule (8.18) can be constructed:

$$\begin{aligned} \mathop {\mathrm{Int}}\nolimits (\mathcal {R})\mathrel {:=}\,\lambda w\,\lambda w'\,\cdot \lambda x\,\lambda y\,\cdot \mathop { Ev}\nolimits ^A wx\,\pmb {\Rightarrow }\,\mathop { Ev}\nolimits ^C w'y \end{aligned}$$
(8.19)

where \(w, w'\) are contexts of the antecedent and consequent of (8.18), respectively, \(\mathop { Ev}\nolimits ^A\) is the intension of the predication in the antecedent and \(\mathop { Ev}\nolimits ^C\) the intension of the predication in the consequent. The linguistic description is interpreted as a set of intensions (8.19) (see footnote 13). When considering a suitable model, we obtain a formal interpretation of (8.19) as a function that assigns to each pair of contexts \(w, w'\in W\) a fuzzy relation among objects (see footnote 14). It is important to realize that in this case, we introduce a consistent model of the context and provide a general rule for the construction of the extension in every context.
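The structure of (8.19) translates almost literally into nested functions. The Python sketch below is only an illustration under stated assumptions: the extensions `small` and `very_big` are simple piecewise-linear stand-ins, the two contexts are hypothetical, and the implication is the Łukasiewicz one, \(a\rightarrow b = \min (1, 1 - a + b)\).

```python
from typing import Callable, Tuple

Context = Tuple[float, float, float]              # <v_L, v_S, v_R>
Predication = Callable[[Context, float], float]   # context, value -> truth degree


def small(w: Context, x: float) -> float:
    """Assumed linear extension of 'small': 1 at v_L, 0 from v_S on."""
    v_left, v_mid, _ = w
    return max(0.0, min(1.0, (v_mid - x) / (v_mid - v_left)))


def very_big(w: Context, y: float) -> float:
    """Assumed extension of 'very big': a narrowed linear right horizon."""
    _, v_mid, v_right = w
    big = max(0.0, min(1.0, (y - v_mid) / (v_right - v_mid)))
    return max(0.0, 2.0 * big - 1.0)


def implication(a: float, b: float) -> float:
    """Lukasiewicz implication."""
    return min(1.0, 1.0 - a + b)


def rule_intension(ev_antecedent: Predication, ev_consequent: Predication):
    """Mirror of (8.19): lambda w, w' . lambda x, y . Ev^A w x => Ev^C w' y."""
    def intension(w: Context, w_prime: Context):
        def relation(x: float, y: float) -> float:
            return implication(ev_antecedent(w, x), ev_consequent(w_prime, y))
        return relation
    return intension


# "IF temperature is small THEN the amount of gas is very big"
rule = rule_intension(small, very_big)
temperature_ctx = (0.0, 50.0, 100.0)   # hypothetical context of the antecedent
gas_ctx = (0.0, 5.0, 10.0)             # hypothetical context of the consequent
relation = rule(temperature_ctx, gas_ctx)

print(relation(0.0, 10.0), relation(0.0, 8.75), relation(0.0, 5.0))
```

Fixing the pair of contexts yields one fuzzy relation between values of \(X\) and \(Y\); changing the contexts yields another relation from the same intension.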

When a model of a linguistic description is given, we must be able to model human reasoning on the basis of it, given a perception in the form ‘\(X\text { is }\mathcal {A}_0\)’ where \(\mathcal {A}_0\) may differ slightly from all the \(\mathcal {A}_1, \ldots , \mathcal {A}_n\). A well elaborated method is perception-based logical deduction (see [36, 47]), whose main idea is to consider the linguistic description as a specific text, which has a topic (what we are speaking about) and focus (what is the new information) (see footnote 15). Each rule is understood as local but vague information about the relation between \(X\) and \(Y\). The given predication ‘\(X\text { is }\mathcal {A}_0\)’ is taken as a perception of some specific value of \(X\). On this basis, the most proper rule from the linguistic description is applied (fires), and the best value of \(Y\) with respect to this rule is taken as a result. Hence, despite the vagueness of the rules forming the linguistic description, the procedure can distinguish among them.

The rule of perception-based logical deduction can be formally expressed as

$$ r_{PbLD}: \frac{\textit{LPerc}^{\mathop { LD}\nolimits }(x_0, w)= \mathop {\mathrm{Int}}\nolimits (X\text { is }\mathcal {A}_{i_0}), \mathop { LD}\nolimits }{\mathop { Eval}\nolimits (\hat{y}_{i_0}, w', \mathcal {B}_{i_0})}, $$

where \(\mathop { LD}\nolimits \) is a linguistic description, and \(\hat{y}_{i_0}\) is the resulting best value of \(Y\), provided that the perception of \(x_0\) in the context \(w\) is the linguistic expression \(\mathcal {A}_{i_0}\) and the dependence between \(X\) and \(Y\) is locally characterized by \(\mathop { LD}\nolimits \).
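The idea that one rule fires and determines the output can be sketched in Python as follows. This is a deliberately crude simplification and not the procedure of [36, 47]: the perception is chosen simply as the antecedent with the highest degree at \(x_0\), and the “best value” \(\hat{y}\) is taken as the grid point with the highest degree in the consequent. The extensions `small`, `medium`, `big`, the contexts and the grid are all assumptions introduced for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Context = Tuple[float, float, float]            # <v_L, v_S, v_R>
Expression = Callable[[Context, float], float]  # context, value -> truth degree


def small(w: Context, x: float) -> float:
    v_left, v_mid, _ = w
    return max(0.0, min(1.0, (v_mid - x) / (v_mid - v_left)))


def medium(w: Context, x: float) -> float:
    v_left, v_mid, v_right = w
    if x <= v_mid:
        return max(0.0, (x - v_left) / (v_mid - v_left))
    return max(0.0, (v_right - x) / (v_right - v_mid))


def big(w: Context, x: float) -> float:
    _, v_mid, v_right = w
    return max(0.0, min(1.0, (x - v_mid) / (v_right - v_mid)))


def fire_rule(rules: List[Tuple[Expression, Expression]], x0: float,
              w: Context, w_prime: Context,
              grid: Sequence[float]) -> Tuple[int, float]:
    """Return the index i0 of the fired rule and a simplified best value of Y."""
    # Simplified perception: the antecedent that fits x0 best in context w.
    degrees = [antecedent(w, x0) for antecedent, _ in rules]
    i0 = max(range(len(rules)), key=lambda i: degrees[i])
    # Simplified "best value": the grid point most typical for the consequent.
    _, consequent = rules[i0]
    y_hat = max(grid, key=lambda y: consequent(w_prime, y))
    return i0, y_hat


# Hypothetical linguistic description with three rules.
description = [(small, big), (medium, medium), (big, small)]
w_x = (0.0, 50.0, 100.0)
w_y = (0.0, 5.0, 10.0)
y_grid = [0.0, 2.5, 5.0, 7.5, 10.0]

print(fire_rule(description, 10.0, w_x, w_y, y_grid))
print(fire_rule(description, 50.0, w_x, w_y, y_grid))
```

Even in this simplified form, exactly one rule is selected for each input, which is the feature that distinguishes this kind of deduction from an aggregation of all rules.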

Perception-based logical deduction is a powerful reasoning method that models human reasoning well and has a wide variety of applications (see [44]).

8.3.2 Syllogistic Reasoning

In this section, we will discuss syllogistic reasoning on the basis of sentences containing intermediate quantifiers. We work in the formal theory \(T^{\text {IQ}}\) mentioned above.

By a valid syllogism we understand a tripleFootnote 16 of formulas \(\langle P_1, P_2, C \rangle \) such that

$$ T^{\text {IQ}}\vdash P_1\mathop {\pmb { \& }}\nolimits P_2\,\pmb {\Rightarrow }\,C $$

(equivalently, such that \(T^{\text {IQ}}\vdash P_1\,\pmb {\Rightarrow }\,(P_2\,\pmb {\Rightarrow }\,C)\)). Note that if a syllogism is valid, then the inequality

$$\begin{aligned} {\mathcal {M}}(P_1)\otimes {\mathcal {M}}(P_2)\le {\mathcal {M}}(C) \end{aligned}$$
(8.20)

holds in every model \({\mathcal {M}}\,\models \,T^{\text {IQ}}\).
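A small numeric illustration may help to read (8.20). We assume here that \(\otimes \) is the Łukasiewicz conjunction \(a\otimes b = \max (0, a+b-1)\), which is the usual choice when the underlying logic is Łukasiewicz-based; this section does not restate the definition, so the choice should be taken as an assumption. The function names are hypothetical.

```python
def lukasiewicz_and(a: float, b: float) -> float:
    """Lukasiewicz strong conjunction (assumed interpretation of the product in (8.20))."""
    return max(0.0, a + b - 1.0)


def conclusion_lower_bound(p1: float, p2: float) -> float:
    """Smallest truth degree of the conclusion allowed by (8.20) for a valid syllogism."""
    return lukasiewicz_and(p1, p2)


def respects_validity(p1: float, p2: float, c: float) -> bool:
    """Check the inequality M(P1) (x) M(P2) <= M(C) for given truth degrees."""
    return lukasiewicz_and(p1, p2) <= c


if __name__ == "__main__":
    # Premises true in degrees 0.9 and 0.8 force the conclusion to about 0.7 or more.
    print(conclusion_lower_bound(0.9, 0.8))
    print(respects_validity(0.9, 0.8, 0.75))
```

Under this assumption, the bound becomes trivial (zero) as soon as the truth degrees of the premises sum to at most 1, so a valid syllogism is informative only when both premises are sufficiently true.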

Let \(Q_1, Q_2, Q_3\) be intermediate quantifiers and \(X, Y, M \in \textit{Form}_{o\alpha }\) be formulas representing properties. As in classical logic, we will consider four figures of syllogisms:

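$$\begin{array}{llll}
\text {Figure I} & \text {Figure II} & \text {Figure III} & \text {Figure IV}\\
Q_1\,M\text { are }Y & Q_1\,Y\text { are }M & Q_1\,M\text { are }Y & Q_1\,Y\text { are }M\\
Q_2\,X\text { are }M & Q_2\,X\text { are }M & Q_2\,M\text { are }X & Q_2\,M\text { are }X\\
\hline
Q_3\,X\text { are }Y & Q_3\,X\text { are }Y & Q_3\,X\text { are }Y & Q_3\,X\text { are }Y
\end{array}$$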

Peterson demonstrated in his book [55] that there are 105 valid intermediate syllogisms. All of them contain the intermediate quantifiers introduced above. Their validity has been proved syntactically in FNL, i.e., in the formal theory \(T^{\text {IQ}}\) (see [32]). This is a very strong result, because it assures us that the inequality (8.20) holds in every model. For example, the following is the list of valid intermediate syllogisms containing the intermediate quantifiers almost all (P), few (B), most (T, D), and many (K, G):

$$\begin{array}{llll}
\text {Figure I} & \text {Figure II} & \text {Figure III} & \text {Figure IV}\\
\hline
\mathbf {AAP} & \mathbf {AEB} & ({}^*\!\mathbf {P}){\mathbf {A}}\mathbf {I} & \mathbf {AEB}\\
\mathbf {APP} & \mathbf {ABB} & {\mathbf {E}}({}^*\!\mathbf {P})\mathbf {O} & ({}^*\!\mathbf {P}){\mathbf {A}}\mathbf {I}\\
\mathbf {APT} & \mathbf {ABD} & ({}^*\!\mathbf {B}){\mathbf {A}}\mathbf {O} & {\mathbf {E}}({}^*\!\mathbf {P})\mathbf {O}\\
\mathbf {APK} & \mathbf {ABG} & {\mathbf {A}}({}^*\!\mathbf {P})\mathbf {I} & \\
\mathbf {API} & {\mathbf {A}}({}^*\!\mathbf {B})\mathbf {O} & \mathbf {P}({}^*\!\mathbf {P})\mathbf {I} & \\
\mathbf {EAB} & \mathbf {EAB} & \mathbf {T}({}^*\!\mathbf {P})\mathbf {I} & \\
\mathbf {EPB} & \mathbf {EPB} & ({}^*\!\mathbf {K})\mathbf {P}\mathbf {I} & \\
\mathbf {EPD} & \mathbf {EPD} & ({}^*\!\mathbf {P})\mathbf {T}\mathbf {I} & \\
\mathbf {EPG} & \mathbf {EPG} & \mathbf {P}({}^*\!\mathbf {K})\mathbf {I} & \\
{\mathbf {E}}({}^*\!\mathbf {P})\mathbf {O} & {\mathbf {E}}({}^*\!\mathbf {P})\mathbf {O} & \mathbf {B}({}^*\!\mathbf {P})\mathbf {O} & \\
 & & \mathbf {D}({}^*\!\mathbf {P})\mathbf {O} & \\
 & & \mathbf {G}({}^*\!\mathbf {P})\mathbf {O} & \\
 & & \mathbf {B}({}^*\!\mathbf {T})\mathbf {O} & \\
 & & \mathbf {B}({}^*\!\mathbf {K})\mathbf {O} &
\end{array}$$

(the letters refer to the concrete quantifiers introduced above, and the asterisks denote quantifiers with the presupposition of non-emptiness of the universe).

Examples of valid syllogisms are the following:

$$\begin{aligned} \mathbf{ATT }{\text {-I}}: \frac{\begin{array}{l}\text {All women are well dressed}\\ \text {Most people in the party are women}\end{array}}{\text {Most people in the party are well dressed}} \end{aligned}$$
$$\begin{aligned} \mathbf{ETO }{\text {-II}}:\frac{\begin{array}{l}\text {No lazy people pass exam}\\ \text {Most students pass exam}\end{array}}{\text {Some students are not lazy people}} \end{aligned}$$
$$\begin{aligned} \mathbf{PPI }{\text {-III}}:\frac{\begin{array}{l}\text {Almost all old people are ill}\\ \text {Almost all old people have gray hair}\end{array}}{\text {Some people with gray hair are ill}} \end{aligned}$$
$$\begin{aligned} \mathbf{TAI }{\text {-IV}}:\frac{\begin{array}{l}\text {Most shares with downward trend are from energy industry}\\ \text {All shares of energy industry are important}\end{array}}{\text {Some important shares have downward trend}} \end{aligned}$$
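The names of the syllogisms above (three mood letters plus a figure) can be expanded mechanically into schematic sentences. The following sketch does this under two assumptions: the terms are arranged in the figures as in classical logic (which agrees with the four examples above), and the letters B, D and G are read as "few", "most ... are not" and "many ... are not" following Peterson's convention. Asterisked quantifiers with presupposition are not handled, and all names in the code are hypothetical.

```python
from typing import Dict, Tuple

# Assumed sentence templates for the quantifier letters (s = subject, p = predicate).
TEMPLATES: Dict[str, str] = {
    "A": "All {s} are {p}",
    "E": "No {s} are {p}",
    "I": "Some {s} are {p}",
    "O": "Some {s} are not {p}",
    "P": "Almost all {s} are {p}",
    "B": "Few {s} are {p}",
    "T": "Most {s} are {p}",
    "D": "Most {s} are not {p}",
    "K": "Many {s} are {p}",
    "G": "Many {s} are not {p}",
}

# Assumed classical arrangement: (subject, predicate) of the major and minor premise.
FIGURES: Dict[str, Tuple[Tuple[str, str], Tuple[str, str]]] = {
    "I": (("M", "Y"), ("X", "M")),
    "II": (("Y", "M"), ("X", "M")),
    "III": (("M", "Y"), ("M", "X")),
    "IV": (("Y", "M"), ("M", "X")),
}


def expand(mood: str, figure: str) -> Tuple[str, str, str]:
    """Return (major premise, minor premise, conclusion) as schematic sentences."""
    if len(mood) != 3:
        raise ValueError("a mood consists of exactly three quantifier letters")
    major_terms, minor_terms = FIGURES[figure]
    q1, q2, q3 = mood
    major = TEMPLATES[q1].format(s=major_terms[0], p=major_terms[1])
    minor = TEMPLATES[q2].format(s=minor_terms[0], p=minor_terms[1])
    conclusion = TEMPLATES[q3].format(s="X", p="Y")  # the conclusion is always about X and Y
    return major, minor, conclusion


if __name__ == "__main__":
    for line in expand("ETO", "II"):
        print(line)
```

For instance, substituting "lazy people" for Y, "people who pass the exam" for M and "students" for X in the expansion of ETO-II gives the second example above.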

8.3.3 A Model of Commonsense Human Reasoning

Finally, we will also demonstrate the power of FNL in a more complex model of human reasoning. This was shown on the example of the reasoning of the detective Lt. Columbo, based on one episode of the famous TV series.Footnote 17 Let us emphasize that the presented method can be taken as a more general methodology that has a variety of other specific applications (Cf. [11]).

The story:

Mr. John Smith has been shot dead in his house. He was found by his friend, Mr. Robert Brown. Lt. Columbo suspects Mr. Brown to be the murderer.

Mr. Brown’s testimony:

I left home at about 6:30, arrived at John’s house at about 7, found John dead, and went immediately to the phone booth to call the police. They came immediately.

Evidence of Lt. Columbo:

Mr. Smith had a high quality suit and a broken wristwatch stopped at 5:45. There was no evidence of a hard blow to his body. Lt. Columbo touched the engine of Mr. Brown’s car and found it to be more or less cold.

Lt. Columbo concluded that Mr. Brown lied because of the following:

  (i) Mr. Brown’s car engine is more or less cold, so he must have been waiting for a long time (more than about 30 min). Therefore, he could not have just arrived and called the police (who came immediately).

  (ii) A high quality wristwatch does not break after a not too hard blow. A man with a high quality suit and a luxurious house is expected to also have a high quality wristwatch. The wristwatch found on John Smith is of low quality, so it does not belong to him. Hence, it does not show the time of Mr. Smith’s death.

In FNL, the reasoning of Lt. Columbo is modeled by a combination of logical rules, world knowledge and evidence, with the help of non-monotonic reasoning.

The world knowledge includes common sense knowledge of the context and further knowledge that can be characterized using linguistic descriptions applied in a specific context for each variable involved, e.g., the drive duration needed to heat the engine, the temperature of the engine, etc. The linguistic descriptions used include, e.g., the following:

  • Logical rules that are hereditarily valid, for example

    $$\begin{aligned}&\mathsf{IF }\,X\text { is }\mathop { Sm}\nolimits _{\nu }\,\mathsf{ THEN }\,X\text { is }\pmb {\lnot }\mathop { Bi}\nolimits , \\&\mathsf{IF }\,X\text { is }\mathop { Bi}\nolimits _{\nu }\,\mathsf{ THEN }\,X\text { is }\pmb {\lnot }\mathop { Sm}\nolimits . \end{aligned}$$

    where \(\mathop { Sm}\nolimits , \mathop { Bi}\nolimits \) are the linguistic expressions “small” and “big”, and \(\nu \) is some linguistic hedge.

  • Common sense knowledge from physics:

    $$\begin{aligned}&\mathsf{ IF }\,\,{\textit{drive duration}}\text { is }\mathop { Bi}\nolimits \,\,\mathsf{ THEN }\,\,{\textit{engine temperature}}\text { is }\mathop { Bi}\nolimits ,\\&\mathsf{ IF }\,\,{\textit{drive duration}}\text { is }\mathop { Sm}\nolimits \,\,\mathsf{ THEN }\,\,{\textit{engine temperature}}\text { is }\textit{ML}\,\mathop { Sm}\nolimits ,\\&\ldots \end{aligned}$$

  • Common sense knowledge of customs of people:

    $$\begin{aligned}&\mathsf{ IF }\,\,{\text {quality of}}\,\,x{\text {'s suit is}}\,\,{\mathop { Bi}\nolimits }\,\,\mathsf{ AND }\,\,{\text {quality of}}\,\,x{\text {'s house is}}\,\,{\textit{Ve}\,\mathop { Bi}\nolimits }\\&\mathsf{ THEN }\,\,\text {wealth of }x\text { is }\mathop { Bi}\nolimits ,\\&\ldots \end{aligned}$$

  • Some other kinds of common sense knowledge, for example, properties of products, etc.

On the basis of a formal analysis, which includes the use of perception-based logical deduction, Lt. Columbo concludes that two specially constructed theories are contradictory. Since his perceptions and the evidence cannot be doubted, Mr. Brown is lying, and so he had an opportunity to kill Mr. Smith. It is important to emphasize that the contradictory theories were constructed as nodes of a graph representing the structure of non-monotonic reasoning.
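The way a contradiction arises in argument (i) can be sketched in a purely symbolic form. This is a strong simplification and not the formal analysis referred to above: truth degrees, contexts and the graph of non-monotonic reasoning are all omitted, facts are plain labels, and rules are applied by naive forward chaining. We also assume that the half-hour drive claimed by Mr. Brown counts as a "big" drive duration. All names are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

# A fact is (variable, evaluative label, affirmed?); affirmed=False stands for negation.
Fact = Tuple[str, str, bool]
Rule = Tuple[FrozenSet[Fact], Fact]  # (premises, conclusion)


def closure(facts: Set[Fact], rules: List[Rule]) -> Set[Fact]:
    """Naive forward chaining: apply the rules until nothing new can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def is_contradictory(facts: Set[Fact]) -> bool:
    """A theory is contradictory if some fact occurs both affirmed and negated."""
    return any((var, label, not affirmed) in facts for var, label, affirmed in facts)


RULES: List[Rule] = [
    # Common sense knowledge from physics (as in the linguistic description above).
    (frozenset({("drive duration", "Bi", True)}), ("engine temperature", "Bi", True)),
    (frozenset({("drive duration", "Sm", True)}), ("engine temperature", "ML Sm", True)),
    # Logical rules: "Sm with a hedge" excludes "Bi", and "Bi" excludes "Sm".
    (frozenset({("engine temperature", "ML Sm", True)}), ("engine temperature", "Bi", False)),
    (frozenset({("engine temperature", "Bi", True)}), ("engine temperature", "Sm", False)),
]

# Assumption: the drive from 6:30 to about 7 is a "big" drive duration.
TESTIMONY: Set[Fact] = {("drive duration", "Bi", True)}
# Perception of Lt. Columbo: the engine is more or less cold.
PERCEPTION: Set[Fact] = {("engine temperature", "ML Sm", True)}

if __name__ == "__main__":
    print(is_contradictory(closure(TESTIMONY | PERCEPTION, RULES)))
```

In this sketch, neither the testimony nor the perception is contradictory on its own; the contradiction appears only when both are put into one theory together with the world knowledge.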

8.4 Conclusion

In this paper, we suggest focusing more deeply on the proclaimed ability of fuzzy sets to model the semantics of natural language expressions. Our idea is to develop a special branch of mathematical fuzzy logic, which we call Fuzzy Natural Logic, as a generalization of the older classical concept of Natural Logic. Its paradigm is to model natural human reasoning that is based on the use of natural language. Therefore, it is also necessary to have a model of the semantics of natural language at our disposal. We argue that FNL should be developed in close cooperation with linguists and logicians.

We gave a brief overview of some units and phenomena of natural language and outlined problems connected with their semantics. In parallel, we outlined how their semantics can be modeled inside FNL. In the second part, we outlined some of the possible human reasoning schemes that can be modeled using FNL.

One can see that many problems and open questions still have to be solved and answered before we can say that FNL is a well-developed theory that has reached its goal of modeling natural human reasoning. Even at this stage of research, though, there are various interesting and quite well-working applications of FNL. Let us mention a few of them:

  • Identification of rock sequences on the basis of expert geologists’ knowledge [34].

  • Linguistic control of technological processes [41, 51].

  • Multi-criteria decision-making (without need to define weights of criteria) [48].

  • Forecasting of the trend-cycle of time series [50, 52] and linguistic evaluation of its trend (i.e., “steep increase/decrease, stagnating, rough increase/decrease”, etc.).