Some additional comments are in order after the brief glance at analogy in Chap. 9. They lead to questions that still remain open, especially that of controlling analogy through degrees measuring the extent to which it is actually present, provided such numerical control is possible.

15.1. To begin with, it should be remarked that, in ordinary reasoning, analogy is often expressed in plain language, sometimes with the help of real or virtual figures; this implies the use, sometimes intensive, of imprecise words in an uncertain setting. Hence, representing analogy would require fuzzy sets; any intervening concepts that are precise could also be represented in this way, by discontinuous membership functions taking only the values 0 or 1.

Both for representing imprecision and for controlling uncertainty, it could be suitable to rely on a “degree of analogy” between pairs of concepts. Such a degree, varying between 0 and 1, should indicate the total dissimilarity of the two elements if it is null, and their total similarity if it is one. Provided the degree up to which “x is P” is analogous or similar to “y is Q” is a number S(μ_P(x), μ_Q(y)) ∈ [0, 1], which properties should the function S: [0, 1] × [0, 1] → [0, 1] verify?

The properties S(a, a) = 1 and S(a, N(a)) = 0, where N is a negation function, undoubtedly seem to reflect properties of analogy; note that the value of S for a pair (μ_P(x), μ_{P^a}(x)) is not necessarily null, since P and P^a can show some degree of analogy.

A typical analogy scheme,

$$ a:b::c:d \Leftrightarrow a\,{\text{is}}\,{\text{to}}\,b\,{\text{as}}\,c\,{\text{is}}\,{\text{to}}\,d, $$

should imply S(a, b) ≤ S(c, d), although the converse cannot always be sustained; an ε-approximation of the type “there exists a small number ε > 0 such that |S(a, b) − S(c, d)| ≤ ε” could also be suitable.

What does not seem to be always adequate is presuming S(a, b) = S(b, a) for all a, b in [0, 1], because in general supposing “x is P” is analogous with “y is Q” does not mean that “y is Q” is analogous to “x is P”.

Establishing a list of axioms for S is not easy; there is no single type of analogy. In any case, analogy also presents the problem of its breaking, like that formerly mentioned for synonyms, which admits an interpretation by means of the following generalized idea of transitivity.

It is said that a function S is F-transitive for an operation F: [0, 1] × [0, 1] → [0, 1] provided

$$ F\left( {S\left( {a,b} \right),S\left( {b,c} \right)} \right) \le S\left( {a,c} \right),\;{\text{for}}\,{\text{all}}\,a,b,{\text{and}}\,c\,{\text{in}}\,\left[ {0, 1} \right]. $$

This definition translates the transitive law “a : b & b : c \( \Rightarrow\) a : c” into degrees, allowing degrees to decrease along chains of analogy and eventually leading to the disappearance of the analogy between the first and last steps of the chain. Which properties should be attributed to the operation F?

If it can be accepted that

$$ a:b\,\&\, b:c\,{\text{means}}\,{\text{the}}\,{\text{same}}\,{\text{as}}\,b:c\,\&\, a:b, $$

something not at all odd, the commutative property of F is beyond doubt, as is F(1, 0) = F(0, 1) = 0 once S(a, a) = 1, S(a, N(a)) = 0, and F-transitivity are accepted; for instance, F(S(a, a), S(a, N(a))) ≤ S(a, N(a)) shows that F(1, 0) = 0. It also seems beyond doubt that F(1, 1) = 1 once S(a, a) = 1 is accepted. Adding F(0, 0) = 0 does not seem odd either, provided it is accepted that “S(a, b) = S(b, c) = 0 \( \Rightarrow\) S(a, c) = 0”. Neither the basic properties for S nor those for F can be easily fixed; they seem to be of a contextual character.

In sum, let’s call S: [0, 1] × [0, 1] → [0, 1] an F-similitude function, or an index of (symmetrical) analogy, provided it verifies

  • S(a, a) = 1, for all a in [0, 1],

  • S(a, b) = S(b, a), for all a, b in [0, 1], and

  • There exists a commutative operation F in [0, 1], verifying the border conditions F(0, 0) = F(0, 1) = 0, and F(1, 1) = 1, with which it is:

$$ F\left( S\left( \mu(x), \sigma(x) \right), S\left( \sigma(x), \lambda(x) \right) \right) \le S\left( \mu(x), \lambda(x) \right), $$

for all μ, σ, λ in [0, 1]^X, and x in X.

Of course, if more properties were added to S and F, more could follow from this definition. For instance, provided neutrality and monotonicity, that is, F(a, 1) = a, and a ≤ b \( \Rightarrow\) F(a, c) ≤ F(b, c) for all a, b, c in [0, 1], respectively, were added to F, then a ≤ 1 and c ≤ d would imply F(a, c) ≤ F(1, d) = d. Thus, provided it were ε ≤ a, and ε ≤ b, it would be

$$ F\left( {\varepsilon ,\varepsilon } \right) \le F\left( {a,b} \right),{\text{but}}\,{\text{not}}\,{\text{necessarily}}\;\,\varepsilon \le F\left( {a,b} \right), $$

unless it were ε ≤ F(ε, ε). Note that, although with some reservation for what concerns symmetry in analogy, it is presumed that S is symmetrical, and it means that the former definition of S cannot be considered too general, but only suitable for a symmetrical analogy.
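The conditions in the definition of an F-similitude function can be probed numerically on a finite grid. The following is only a sketch under stated assumptions: the index `S(a, b) = 1 − |a − b|` is an illustrative choice that does not come from the text, the helper names are hypothetical, and a grid check can refute a property but cannot prove it on the whole of [0, 1].

```python
from itertools import product

def W(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)

def prod(a, b):
    """Product t-norm."""
    return a * b

def S(a, b):
    """Illustrative symmetric index (an assumption, not taken from the text)."""
    return 1.0 - abs(a - b)

GRID = [k / 10 for k in range(11)]  # sample points 0.0, 0.1, ..., 1.0

def has_border_conditions(F):
    """Check F(0, 0) = F(0, 1) = F(1, 0) = 0 and F(1, 1) = 1."""
    return (F(0.0, 0.0) == 0 and F(0.0, 1.0) == 0
            and F(1.0, 0.0) == 0 and F(1.0, 1.0) == 1)

def is_commutative(F, grid=GRID):
    """Check F(a, b) = F(b, a) on the grid."""
    return all(F(a, b) == F(b, a) for a, b in product(grid, repeat=2))

def is_F_transitive(S, F, grid=GRID, tol=1e-12):
    """Check F(S(a, b), S(b, c)) <= S(a, c) on the grid, up to rounding."""
    return all(F(S(a, b), S(b, c)) <= S(a, c) + tol
               for a, b, c in product(grid, repeat=3))

for name, F in [("W", W), ("prod", prod), ("min", min)]:
    print(name, has_border_conditions(F), is_commutative(F), is_F_transitive(S, F))
```

For this particular S, the three operations satisfy the border conditions and commutativity, but only W passes the transitivity check: the triple a = 0, b = 0.5, c = 1 gives S(a, b) = S(b, c) = 0.5 and S(a, c) = 0, which already rules out prod and min.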

Let’s show an example with F = W_f, a t-norm in the Łukasiewicz family, that is, W_f(a, b) = f^{-1}(max(0, f(a) + f(b) − 1)) with f an order automorphism of [0, 1], that, obviously, can be taken as a function F. If it is

$$ \varepsilon \le S\left( \mu(x), \sigma(x) \right), \quad {\text{and}} \quad \varepsilon \le S\left( \sigma(x), \lambda(x) \right), $$

it follows that

$$ \begin{aligned} W_{f}(\varepsilon, \varepsilon) & = f^{-1}\left( \max\left( 0, 2f(\varepsilon) - 1 \right) \right) \le W_{f}\left( S(\mu(x), \sigma(x)), S(\sigma(x), \lambda(x)) \right) \le S(\mu(x), \lambda(x)) \\ & \Rightarrow \max\left( 0, 2f(\varepsilon) - 1 \right) \le f\left( S(\mu(x), \lambda(x)) \right). \\ \end{aligned} $$

That is, provided 2f(ε) − 1 ≤ 0 ⇔ f(ε) ≤ 1/2 ⇔ ε ≤ f^{-1}(1/2), nothing could be concluded, and only if 1/2 < f(ε) ⇔ f^{-1}(1/2) < ε would it follow that 0 < f^{-1}(2f(ε) − 1) ≤ S(μ(x), λ(x)). Hence, under the hypotheses that μ(x) corresponds to “x is P”, σ(x) to “x is Q”, λ(x) to “x is R”, and that it is f^{-1}(1/2) < ε, it can be established that if “x is P” is similar to “x is Q”, and this statement is similar to “x is R”, with respective degrees not smaller than ε, then the first statement is similar to the third with a degree not smaller than f^{-1}(2f(ε) − 1) > 0.

To keep ε ≤ S(μ(x), λ(x)), it suffices that

$$ \varepsilon \le f^{ - 1} \left( { 2f\left( \varepsilon \right) - 1} \right) \Leftrightarrow 1 \le f\left( \varepsilon \right) \Leftrightarrow 1 \le \varepsilon \;\;{\text{implying}}\;\;\varepsilon = 1, $$

and the three statements are fully similar. Hence, if, as usual, 0 < ε < 1, it would be 0 ≤ f^{-1}(max(0, 2f(ε) − 1)) < ε, showing that the guaranteed degree of similarity between the first and third statements actually decreased; consequently, if ε ∈ (0, 1) is a threshold of analogy, that is, a level under which analogy ceases to be preserved, the analogy would be lost by not surpassing such a threshold. It can happen in three steps as in the example, or in more steps, but the decreasing values show that arriving at a step not keeping analogy with the first can be assured.

What happens with F = prod? With F = prod, ε² ≤ S(μ(x), λ(x)) is obtained, and because for 0 < ε < 1 it is 0 < ε² < ε, the same conclusion follows.
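The decrease of the guaranteed degree along a chain can be illustrated with a short computation. This is a minimal sketch, assuming that F is monotone and that every link of the chain has degree at least ε; the function names, the sample value ε = 0.9, and the threshold 0.75 are hypothetical choices made only for illustration.

```python
def W(a, b):
    """Lukasiewicz t-norm (the case f = identity of W_f)."""
    return max(0.0, a + b - 1.0)

def prod(a, b):
    """Product t-norm."""
    return a * b

def lower_bound(F, eps, links):
    """Lower bound for S(first, last) that F-transitivity guarantees
    when each of the `links` consecutive degrees is at least eps.
    Assumes F is monotone in both arguments."""
    bound = eps
    for _ in range(links - 1):
        bound = F(bound, eps)
    return bound

def links_until_lost(F, eps, threshold, max_links=1000):
    """Smallest number of links for which the guaranteed bound drops
    below the threshold, or None if it never does within max_links."""
    bound = eps
    for links in range(1, max_links + 1):
        if bound < threshold:
            return links
        bound = F(bound, eps)
    return None

for name, F in [("W", W), ("prod", prod), ("min", min)]:
    bounds = [round(lower_bound(F, 0.9, n), 3) for n in range(1, 6)]
    print(name, bounds, links_until_lost(F, 0.9, 0.75))
```

With W and prod the bound falls below the chosen threshold after three links, whereas with min it stays at ε, which is the behavior the text describes for idempotent elements.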

For not having decreasing degrees, it should be F(ε, ε) = ε, and this is only possible (with continuous t-norms) either if F = min, or F is an ordinal sum counting ε among its idempotent elements. Anyway, with these last t-norms the chains of analogous statements will not break, and this seems to be something actually rare in analogy, as it is with synonymy that can be seen as a linguistic phenomenon of graded analogy of meaning. Among t-norms only those in the family of W seem suitable for modeling analogy’s breaks.

15.2. In artificial intelligence, and namely in case-based reasoning where reasoning is conducted by analogy, it is important to control analogy by means of a threshold. For instance, there is a system trying to mechanize geometrical reasoning by analogy, in which the function S is given by:

$$ S\left( {a,b} \right) = \sum {\hbox{min} } \left( {a_{i} ,b_{i} } \right)/\hbox{max} \left( {\sum {a_{i} } ,\sum {b_{i} } } \right) \in \left[ {0, 1} \right], $$

provided both figures a and b are characterized by the same attributes A_1, …, A_n, each figure satisfying them with degrees a_i and b_i in [0, 1] (1 ≤ i ≤ n), respectively. As is easy to see,

  • \( S\left( {a,b} \right) = 1 \Leftrightarrow a_{i} = b_{i} \quad {\text{for}}\,{\text{all}}\,i \in \left\{ {1,2, \ldots ,n} \right\} \);

  • \( S\left( {a,b} \right) = 0 \Leftrightarrow \hbox{min} \left( {a_{i} ,b_{i} } \right) = 0,\quad {\text{for}}\,{\text{all}}\,i \);

  • Provided S were F-transitive, F(a, b) = 0 should imply a + b ≤ 1, and, provided it were F ≤ W, then S is F-transitive.

Hence, by constraining F to be a continuous t-norm, it should be a t-norm W_f smaller than W, and neither min nor prod makes S F-transitive.
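The index S of the geometrical system is easy to compute, and a single triple of figures is enough to see why min and prod fail. The sketch below is an illustration only: the function name, the attribute vectors, and the choice to raise an error when both figures have all degrees equal to zero (a case the formula leaves undefined) are assumptions.

```python
def similarity(a, b):
    """S(a, b) = sum_i min(a_i, b_i) / max(sum_i a_i, sum_i b_i)."""
    if len(a) != len(b):
        raise ValueError("both figures need the same attributes")
    denominator = max(sum(a), sum(b))
    if denominator == 0:
        # Assumption: the formula is left undefined for two all-zero figures.
        raise ValueError("S is undefined when all degrees are zero")
    return sum(min(x, y) for x, y in zip(a, b)) / denominator

def W(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)

# Hypothetical figures described by two attributes.
a = [1.0, 0.0]
b = [1.0, 1.0]
c = [0.0, 1.0]

s_ab, s_bc, s_ac = similarity(a, b), similarity(b, c), similarity(a, c)
print(s_ab, s_bc, s_ac)          # 0.5 0.5 0.0

print(W(s_ab, s_bc) <= s_ac)     # True:  0.0  <= 0.0
print(min(s_ab, s_bc) <= s_ac)   # False: 0.5  >  0.0
print(s_ab * s_bc <= s_ac)       # False: 0.25 >  0.0
```

The triple does not prove W-transitivity of S, which the text states separately; it only exhibits a counterexample for min and prod.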

In practice, by trial and error, a threshold of analogy equal to 0.7 was empirically found, with which the system works well. Note that for bringing the former lower bound 0 < f^{-1}(2f(ε) − 1) close to 0.7, it suffices to take the order automorphism f(x) = x², with which it is W_f ≤ W, and

$$ {\hbox{max} \left( {0,2\varepsilon^{2} - 1} \right)^{1/2} } > 0 \Leftrightarrow \sqrt {\left( {1/2} \right)} < \varepsilon . $$

Because √(1/2) ≈ 0.707107, it suffices to take 0.708 ≤ ε as the threshold; as the builders of the system only appreciate up to the first decimal digit, they just took 0.7 instead of 0.708. W_f-transitivity helps to foresee a lower bound for the threshold of analogy.
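The arithmetic behind this threshold can be reproduced in a few lines. The sketch assumes the automorphism f(x) = x² mentioned above; the function names are hypothetical.

```python
import math

def f(x):
    """Order automorphism f(x) = x^2 of [0, 1]."""
    return x * x

def f_inv(y):
    """Inverse automorphism, the square root."""
    return math.sqrt(y)

def guaranteed_degree(eps):
    """W_f(eps, eps) = f^{-1}(max(0, 2 f(eps) - 1))."""
    return f_inv(max(0.0, 2.0 * f(eps) - 1.0))

threshold = f_inv(0.5)  # f^{-1}(1/2) = sqrt(1/2)
print(round(threshold, 6))

for eps in (0.7, 0.708, 0.8, 0.9, 1.0):
    print(eps, round(guaranteed_degree(eps), 4))
```

At ε = 0.7 the guaranteed degree is 0, so nothing can be concluded, while any ε above √(1/2), such as 0.708, already yields a positive bound.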

Formulae such as the former are important for the goal of “controlling” analogy. In fact, analogy often depends on the attributes that are taken for considering it; for instance, as said, if the only considered attribute is “spherical form”, oranges and apples could be seen as similar, but as soon as more attributes such as color, taste, smell, and so on, are considered, such similarity would cease. Analogy is often used, as it is in case-based reasoning, to substitute an element a by another b seen as similar, and on which is known more than is known on a; for instance, in the former example and when S(a, b) > 0.7, the properties shown by figure a are presumed for figure b. This permits, for reasoning on b, to guess that it enjoys the properties of a which are under consideration.

The importance of controlling analogy directly comes from the necessity of not seeing “a like b” when the degree of analogy between a and b is too low, in short and joking, for not eating apples instead of oranges, or oranges instead of apples. It is for this reason that, once an index of similarity or analogy S is known, it is important to fix a threshold of analogy; only two objects a, b such that the index S(a, b) is greater than the threshold, could be interchanged for reasoning. On the contrary, not doing it is foolish and can actually lead to (saying it metaphorically) confounding melons and footballs.

Hence, analogy deserves to be controlled and, in such respect, it is an open problem to know which indexes of analogy are, actually, measures of the contextual meaning of the word “analogous”, for which the empirical relationship “less analogous than” should be previously captured.

15.3. Reasoning by analogy, whether for conjecturing or for refuting, is not always controlled by means of numerical indexes; it is not, for instance, in commonsense reasoning, where numerical comparisons are unusual. This does not exclude some kind of verbal or linguistic control by, for instance, attending to the diverse attributes in play and limiting the analogy to them, as when saying that John and his sister Anne have the “same eyes as their mother”. In such cases, it is obviously not said that John and Anne are fully similar, but just that they are physically similar with respect to the attribute “eyes”; they hold no similarity with respect to mouth, ears, hair color, hands, and so on, and, provided the total number of attributes considered were 10, it could be said that they have 1/10 = 0.1 as a similarity index, a value very far from the value 1 of full similarity. It is more difficult to establish an index when considering nonphysical characteristics of John and Anne, such as social behavior, gesticulating, and the like, or something else that can also be appreciated, such as “John and Anne argue like their parents did.” A very simple index of analogy is the ratio “positive attributes/total number of attributes”, which, nevertheless, does not take into account the degrees to which attributes are satisfied. Anyway, analogy should always be controlled, numerically if possible.

Analogy is still full of mystery concerning the establishment of a suitable mathematical framework for its formal study, that is, for its scientific domestication; but what is beyond doubt is that the analogy between two objects, images, persons, and the like, is rarely considered an identity even if identity can be seen as a particular case of analogy. Analogy always refers to some peculiarities of what is considered, but not always to its “totality”, something that often enough is only metaphysical. For instance, the John that currently is 65 years old, appearing in a photo of him taken 50 years ago, is different but analogous, similar, to the current John, this to the extreme of saying that both are the same John. Not all characteristics of the second John are taken into account, but only some of them.

Also for instance, in the framework of the curves given by quadratic equations, it can be said that a circle and an ellipse whose focal points are close are similar figures, even if they are not identical and nobody with some elemental knowledge of mathematics will confuse circles with ellipses, such as balls with melons; in the end, a circle is but a particular case of ellipse when the two focal points coincide. Anyway, in a problem of graphical design, it is perfectly conceivable to see both figures as analogous with the aim of graphically translating to the circle some property of the ellipse; even in plane geometry there are problems whose solutions are well conjectured after a graphical approximate reasoning with analogous figures.

15.4. A short additional reflection is still in order. In the same vein that circles and ellipses cannot be confused, as well as the John of 50 years ago is not physically the same as today’s John, for a rough reasoning in commonsense reasoning, analogy usually forgives some attributes and just concentrates on others, but, for doing reasonings such as those of science, analogy should be refined.

Beyond formal frames, identity is often a kind of illusion; there is no more identity than a full coincidence in all imaginable orders, that is, a is identical to b if and only if a coincides fully with b, that is, if a could be substituted by b everywhere; it is Leibniz’s old “identity of indiscernibles”. When two real objects in the world, or two virtual objects in the brain, are not identical, they are just different, and analogy is but a form of graduating differences in such a way that the objects are taken as identical when the degree is one, and as fully different only when the degree is zero. In this sense, the symbols = and ≠ are but particular cases of the symbol ~ denoting analogy, and the human senses seem to have an innate capacity not only for grasping difference, but for refining it into analogy.

Surely, without such an intellectual capability, the evolution of mankind would have been different. This capability is possibly already inserted in the brain, with repercussions on reasoning, where analogy should be seen, as said before, as something essential, for instance, for passing from the perception of “big tree” to “big mountain”, to “big number”, and to “big money”. From perceiving that stones are of different sizes, one passes to seeing that different stones can be (imprecisely, of course) classified into several subtypes, such as small, middle, and big; only with time, once the ideas of weighing and of measuring it had appeared and numerical degrees had been established, did the idea of a numerical degree of similitude appear.

15.5. To end this chapter, let’s remark that analogy between things, or between physical situations, passed to analogy between virtual objects and situations, passed to fiction, and to metaphorical stories such as those for children and, finally, to intellectual tasks such as writing novels, philosophy, and also to scientific research. The case of the falling lift of Einstein, the falling apple of Newton, or the ouroboros of Kekulé, are but examples of analogies suggesting a conjecture that, in science, is not only for telling it, but that it should, first of all, be proven by some reasoning perhaps supported by an F-similitude function.

In the same vein, when the classical logical calculus was generalized to the fuzzy one, and the universal connectives and, or, not, and if/then, passed to operations that should be specified at each particular context, the difference suggested that, in large statements, or in systems of many rules, linguistic connectives expressed by the same word deserve to be specified by different operations at each of the statement parts depending on their respective meanings. For instance, in the statement “John and Anne bought a new house, and thought of either repainting all of it, or modernizing the kitchen,” the first “and” seems to be commutative, but the second seems to be not; hence, if translating it into fuzzy terms, the first could be represented by a t-norm, but to represent the second a noncommutative operation should be chosen.

15.6. Another analogy that can give birth to fuzzy sets came from comparing the {0, 1}-valued characteristic function of a crisp set with the truth values of precise statements and then, by analogy, enlarging this set of values to the full unit interval [0, 1] and allowing the “new sets” corresponding to imprecise statements to be characterized by a function ranging in the unit interval. This is a view not supposing that fuzzy sets are a simple generalization of crisp sets, but obtained by some kind of aggregation of them. It is a presumption that, by enlarging the analogy, does not lead to considering only the basic connectives min, max, and 1 − id, in particular to believe that “p and p”, and “p or p” always mean p.

For instance, if, in the universe X = {1, 2, 3, 4, 5}, the sets A = {1, 2, 3} and B = {3, 4, 5}, are “operated” (through their respective characteristic functions) with the arithmetic mean M(x, y) = (x + y)/2, what results is

$$ 1/0.5 + 2/0.5 + 3/1 + 4/0.5 + 5/0.5 $$

that, represented by M(A, B) = (A + B)/2, could suggest, by analogy, considering as fuzzy sets all the entities obtained by applying aggregation functions to sets, such as all the means. Also, for instance, the noncommutative weighted mean N(x, y) = (x + 3y)/4 leads to

$$ N\left( {A,B} \right) = \left( {A + 3B} \right)/4 = 1/0.25 + 2/0.25 + 3/1 + 4/0.75 + 5/0.75. $$

These examples serve to pose the question of which membership functions can come from aggregating a finite number of crisp sets, and to observe that commutativity cannot always hold with them because, for instance, it is M(A, B) = M(B, A), but N(A, B) ≠ N(B, A).
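The two aggregations above can be reproduced by applying M and N pointwise to the characteristic functions of A and B. The sketch below is only an illustration; the helper names `characteristic` and `aggregate` are hypothetical, and membership functions are stored as dictionaries from elements to degrees.

```python
X = [1, 2, 3, 4, 5]
A = {1, 2, 3}
B = {3, 4, 5}

def characteristic(subset, universe):
    """{0, 1}-valued characteristic function of a crisp set."""
    return {x: 1.0 if x in subset else 0.0 for x in universe}

def aggregate(agg, first, second, universe):
    """Apply the aggregation function pointwise to two crisp sets."""
    chi_first = characteristic(first, universe)
    chi_second = characteristic(second, universe)
    return {x: agg(chi_first[x], chi_second[x]) for x in universe}

def M(x, y):
    """Arithmetic mean."""
    return (x + y) / 2

def N(x, y):
    """Noncommutative weighted mean."""
    return (x + 3 * y) / 4

print(aggregate(M, A, B, X))  # {1: 0.5, 2: 0.5, 3: 1.0, 4: 0.5, 5: 0.5}
print(aggregate(N, A, B, X))  # {1: 0.25, 2: 0.25, 3: 1.0, 4: 0.75, 5: 0.75}
print(aggregate(N, B, A, X))  # {1: 0.75, 2: 0.75, 3: 1.0, 4: 0.25, 5: 0.25}
```

Swapping the arguments leaves the result of M unchanged but exchanges the degrees 0.25 and 0.75 in the result of N, which is the failure of commutativity noted in the text.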

Although not all fuzzy sets can come from aggregating a finite number of crisp sets, it does not mean that, given a fuzzy set μ on X, there is not a family of aggregation functions {A_x; x ∈ X}, each giving a value of the membership function μ(x) = A_x(Ch(x)) for some numerical characteristic Ch(x) of each x, such as in the limiting case of a single aggregation with the former M(A, B) and N(A, B). In the end, it is similar to what was formerly said on obtaining the numerical values μ(x) by a statistical methodology.