1 Introduction

Analogical reasoning has been regarded for a long time as a fruitful, creative way of drawing conclusions, or of explaining states of fact, even if this form of reasoning does not present the guarantees of validity offered by deductive reasoning. As such, it has been extensively studied in particular by philosophers, psychologists and computer scientists. We can cite for instance [12, 14, 15, 34] where the power of analogical reasoning is emphasized. See also [28] for a computationally oriented survey of current trends.

Roughly speaking, the idea of analogy is to establish a parallel between two situations [11, 36] on the basis of which, one tentatively concludes that what is true in the first situation may also be true in the second one. When the two situations refer to apparently unrelated domains, the parallel may be especially rich. Just think of the well known example of the Bohrā€™s model of atom where electrons circle around the kernel, which is analogically linked to the model of planets running around the sun.

Closely related to analogical reasoning is the idea of analogical proportions, i.e., statements of the form ā€œa is to b as c is to dā€, which dates back to Aristotle (in Western world). This establishes a parallel, here between the pair (a,Ā b) and the pair (c,Ā d) [13]. Case-based reasoning seems also to obey a reasoning pattern of the same kind. Indeed it establishes a collection of parallels between known cases referring to a pair \((<problem_i>,\) \(<solution_i>)\) and a new \(<problem_0>\), for which one may think of a \(<solution_0>\) that is all the more similar to \(<solution_i>\) as \(<problem_0>\) is similar to \(<problem_i>\). This suggests that case-based reasoning is a particular instance of analogical reasoning. Still we shall see that analogical proportion-based inference significantly departs from case-based reasoning.

This survey paper is structured as follows. First the notion of logical proportion is recalled in Sect.Ā 2 before focusing on analogical proportion in Sect.Ā 3 and other homogeneous proportions in Sect.Ā 4, in a Boolean setting. The extension of analogical proportions to multiple-valued settings is briefly presented in Sect.Ā 5. The analogical inference machinery is discussed in Sect.Ā 6, applications are reviewed in Sect.Ā 7; relations and differences with case-based reasoning are addressed in Sect.Ā 8.

2 Boolean Logical Proportions

Proportions in mathematics state the identity of relations between two ordered pairs of entities, say (a,Ā b) and (c,Ā d). Thus, the geometric proportion corresponds to the equality of two ratios, i.e., \(a/b= c/d\), while the arithmetic proportion compares two pairs of numbers in terms of their differences, i.e., \(a - b =c - d\). In these equalities, which emphasize the symmetric role of the pairs (a,Ā b) and (c,Ā d), geometric or arithmetic ratios have an implicit comparative flavor, and the proportions express the invariance of the ratios. Note that by cross-product for geometric proportion, or by addition for the arithmetic one, the two proportions are respectively equivalent to \(ad = bc\) and to \(a + d = b + c\), which makes clear that b and c, or a and d, can be permuted without changing the validity of the proportion. Moreover, mathematical proportions are at the basis of reasoning procedures that enable us to ā€œextrapolateā€ the fourth value knowing three of the four quantities. Indeed, assuming that d is unknown, one can deduce \(d = c \times b/a\) in the first case, which corresponds to the well-known ā€œrule of threeā€, or \(d = c + (b - a)\) in the second case. Besides, continuous proportions where \(b = c\) are directly related to the idea of averaging, since taking \(b = c\) as the unknown respectively yields the geometric mean \((ad)^{1/2}\) and the arithmetic mean \((a\, +\, d)/2\).

Generally speaking, the idea of proportion is a matter of comparison of comparisons, as suggested by the statement ā€œa is to b as c is to dā€. In the Boolean setting there are four comparison indicators. On the one hand there are two similarity indicators, namely a positive one \(a \wedge b\) and a negative one \(\lnot a \wedge \lnot b\), and on the other hand two dissimilarity indicators \(\lnot a \wedge b\) and \(a \wedge \lnot b\). Logical proportions [27, 29] connect four Boolean variables through a conjunction of two equivalences between similarity or dissimilarity indicators pertaining respectively to two pairs (a,Ā b) and (c,Ā d). More formally

Definition 1

A logical proportion T(a,Ā b,Ā c,Ā d) is the conjunction of two equivalences between indicators for (a,Ā b) on one side and indicators for (c,Ā d) on the other side.

For instance, \(((a \wedge \lnot b) \equiv (c \wedge \lnot d)) \wedge ((a \wedge b) \equiv (c \wedge d))\) is a logical proportion, expressing that ā€œa differs from b as c differs from dā€ and that ā€œa is similar to b as c is similar to dā€. It has been established that there are 120 syntactically and semantically distinct logical equivalences. All these proportions share a remarkable property: they are true for exactly 6 patterns of values of abcd among \(2^4\) possible values. Thus, the above example is true for 0000, 1111, 1010, 0101, 0001, and 0100. The interested reader is referred to [27, 29] for thorough studies of the different types of logical proportions. In the following, we only consider those satisfying the code independent property. This property expresses that there should be no distinction when encoding information positively or negatively. In other words, encoding truth (resp. falsity) with 1 or with 0 (resp. with 0 and 1) is just a matter of convention, and should not impact the final result. Thus we should have the following entailment between the two logical expressions T(a,Ā b,Ā c,Ā d) and \(T(\lnot a,\lnot b,\lnot c,\lnot d)\), i.e., \(T(a,b,c,d) \Rightarrow T(\lnot a,\lnot b,\lnot c,\lnot d)\).

It has been established [27] that there only exist eight logical proportions that satisfy the above property. Indeed from a structural viewpoint, note that a proportion is built up with a pair of equivalences between indicators chosen among \(4\times 4 =16\) equivalences. So, to ensure code independency, the only way to proceed is to first choose an equivalence then to pair it with its counterpart where every literal is negated: for instance \(a \wedge b \equiv \lnot c\wedge d\) should be paired with \(\lnot a \wedge \lnot b \equiv c\wedge \lnot d\) in order to get a code independent proportion. This simple reasoning shows that we have indeed 16/2 = 8 possibilities.

The 8 code independent proportions split into 4 homogeneous proportions that are symmetrical (one can exchange (a,Ā b) with (c,Ā d)) and 4 heterogeneous ones that are not symmetrical. Homogeneity here refers to the fact that in the expression of the proportions, both equivalences link indicators of the same kind (similarity or dissimilarity), while in the case of heterogeneous proportions they link indicators of opposite kinds. This explains why there are four homogeneous and four heterogeneous logical proportions. In the following section, we focus our attention on one especially remarkable code independent logical proportion, the analogical proportion, reviewing the 7 others in the next section. Note also that the first example of logical proportion given above after DefinitionĀ 1 is symmetrical, but not code independent.

3 Boolean Analogical Proportion

The analogical proportion ā€œa is to b as c is to dā€ more formally states that a differs from b as c differs from d and b differs from a as d differs from cā€. This is logically expressed as [24] by the quaternary connective \(\text {Ana}\):

$$\begin{aligned} \text {Ana}(a, b, c,d) \triangleq ((a \wedge \lnot b) \equiv (c \wedge \lnot d)) \wedge ((\lnot a \wedge b) \equiv (\lnot c \wedge d)) \end{aligned}$$
(1)

Note that this logical expression of an analogical proportion only uses dissimilarity indicators, and does not mix a dissimilarity indicator and a similarity indicator as in the first example of logical expression we gave. In some sense Analogy is first a matter of (controlled) dissimilarity. TableĀ 1 exhibits the 6 patterns for which \(\text {Ana}(a, b, c,d)\), also traditionally denoted \(a:b\,::\,c : d\), is true. It can be easily checked on this table that the analogical proportion is indeed independent with respect to the positive or negative encoding of properties. Moreover, one can also see that the logical expression of \(a:b\,::\,c : d\) satisfies the key properties of an analogical proportion, namely

  • reflexivity: \(a:b\,::\,a:b\)

  • symmetry: \(a:b\,::\,c : d \Rightarrow c:d\,::\,a : b\)

  • central permutation: \(a:b\,::\,c : d \Rightarrow a:c\,::\,b : d\)

Table 1. Boolean patterns making Analogy true

Consequently it also satisfies \(a:a\,::\,b : b\), and the external permutation \(a:b\,::\,c : d\) \(\Rightarrow d:b\,::\,c : a\). Note also that these properties clearly hold for numerical proportions (\(\frac{a}{b}=\frac{a}{b}\); \(\frac{a}{b}=\frac{c}{d}\Rightarrow \frac{c}{d}=\frac{a}{b}\), and \(\frac{a}{b}=\frac{c}{d}\Rightarrow \frac{a}{c}=\frac{b}{d}\)). TableĀ 1 is not the only Boolean model satisfying the three above postulates, but it is the minimal one. See [32] for this result and also for the justification of the 6 patterns in TableĀ 1 in terms of (minimal) Kolmogorov complexity. Moreover, with this definition, the analogical proportion is transitive in the following sense:

$$(a : b\,::\,c : d) \wedge (c : d\,::\,e : f) \Rightarrow a : b\,::\,e : f$$

Besides, note also that analogical proportion holds for the three following generic patterns: \(s:s\,::\,s : s\), \(s : s\,::\,t : t\) and \(s : t\,::\,s : t\) where s and t are distinct values, which is the basis for the extension of the definition of the analogical proportion to nominal values. The above Boolean logic view of analogical proportion agrees with other previous proposals aiming at formalizing the idea of analogical proportion in various algebraic settings (including set-theoretic definitions of analogical proportions) [19, 22, 35]; see [24, 27] for details. Moreover, it is also worth noticing that the constraint \(a - b =c - d\) defining arithmetic proportions, when restricted to \(\{0,1\}\), validates the same 6 patterns as in TableĀ 1, although \(a - b \in \{-1, 0,1\}\) in this case. This arithmetic proportion view of analogical proportion is the one advocated by [33] between numerical vectors representing words in high-dimensional spaces.

Representing objects with a single Boolean value is not generally sufficient and we have to consider situations where items are represented by vectors of Boolean values, each component being the value of a binary attribute. A simple extension of the previous definition to Boolean vectors in \(\mathbb {B}^n\) of the form \(\varvec{a}\!=\!(a_1, ..., a_n)\) can be done as follows:

$$\varvec{a} : \varvec{b}\,::\,\varvec{c} : \varvec{d} \text{ iff } \forall i \in [1,n], a_i:b_i\,::\,c_i : d_i$$

Obviously, all the basic properties (symmetry, central permutation) still hold for vectors. In that respect it is important to notice that the four vectors are of the same nature, since they refer to the same set of features. Then symmetry just means that comparing the results of the comparisons of the two vectors inside each pair of vectors \((\varvec{a},\varvec{b})\) and \((\varvec{c},\varvec{d})\) does not depend on the ordering of the two pairs. Thus the repeated applications of symmetry followed by central permutation yield 8 equivalent forms of the analogical proportion: \((\varvec{a} : \varvec{b}\,::\,\varvec{c} : \varvec{d}) \!=\! (\varvec{c} : \varvec{d}\,::\,\varvec{a} : \varvec{b})\! =\! (\varvec{c} : \varvec{a}\,::\,\varvec{d} : \varvec{b}) \!=\! (\varvec{d} : \varvec{b}\,::\,\varvec{c} : \varvec{a})\) \(= (\varvec{d} : \varvec{c}v\varvec{b} : \varvec{a}) = (\varvec{b} : \varvec{a}\,::\,\varvec{d} : \varvec{c}) = (\varvec{b} : \varvec{d}\,::\,\varvec{a} : \varvec{c})= (\varvec{a} : \varvec{c}\,::\,\varvec{b} : \varvec{d})\). TableĀ 2 pictures the situation, where the components of the vectors have been suitably reordered in such a way that the attributes for which one of the 6 patterns characterizing the analogical proportion is observed, have been gathered, e.g., attributes \(\mathcal {A}_1\) to \(\mathcal {A}_{i-1}\) exhibits the pattern 1111. In the general case, some of the patterns may be absent.

Table 2. Pairing pairs (a,Ā b) and (c,Ā d)

This table shows that building the analogical proportion \(\varvec{a}: \varvec{b}\,::\,\varvec{c}: \varvec{d}\) is a matter of pairing the pair \((\varvec{a}, \varvec{b})\) with the pair \((\varvec{c}, \varvec{d})\). More precisely, on attributes \( \mathcal {A}_1\) to \(\mathcal {A}_{j-1}\), the four vectors are equal; on attributes \( \mathcal {A}_j\) to \(\mathcal {A}_{r-1}\), \(\varvec{a}= \varvec{b}\) and \(\varvec{c}= \varvec{d}\), but \((\varvec{a}, \varvec{b}) \ne (\varvec{c}, \varvec{d})\). In other words, on attributes \( \mathcal {A}_1\) to \(\mathcal {A}_{r-1}\) \(\varvec{a}\) and \( \varvec{b}\) agree and \(\varvec{c}\) and \( \varvec{d}\) agree as well. This contrasts with attributes \( \mathcal {A}_r\) to \( \mathcal {A}_n\), for which we can see that \(\varvec{a}\) differs from \(\varvec{b}\) as \(\varvec{c}\) differs from \(\varvec{d}\) (and vice-versa). We recognize the meaning of the formal definition of the analogical proportion.

Table 3. Pairing (a,Ā d) and (b,Ā c)

Let us now pair the vectors differently, namely considering pair \((\varvec{a}, \varvec{d})\) and pair \((\varvec{b}, \varvec{c})\), as in TableĀ 3. First, we can see that \(\varvec{a}: \varvec{d}\,::\,\varvec{b}: \varvec{c}\) does not hold due to attributes \( \mathcal {A}_s\) to \( \mathcal {A}_n\). Obviously, we continue to have \(\varvec{a}= \varvec{b}=\varvec{c}= \varvec{d}\) for attributes \( \mathcal {A}_1\) to \(\mathcal {A}_{j-1}\), while on the rest of the attributes the values inside each pair differ (in four different ways). Then the following definition of the analogical proportion [24], logically equivalent to Eq.Ā 1, should not come as a surprise:

$$\begin{aligned} a : b\,::\,c : d=((a \wedge d) \equiv (b \wedge c)) \wedge ((\lnot a \wedge \lnot d) \equiv (\lnot b\wedge \lnot c)) \end{aligned}$$
(2)

or equivalently

$$\begin{aligned} a : b\,::\,c : d=((a \wedge d) \equiv (b \wedge c)) \wedge ((a \vee d) \equiv (b \vee c)) \end{aligned}$$
(3)

Expression (3) can be viewed as the logical counterpart of a well-known property of geometrical proportions: the product of the means is equal to the product of the extremes. Interestingly enough, Piaget [25] pp. 35ā€“37) named logical proportion any logical expression between four propositional formulas a,Ā b,Ā c,Ā d for which (3) is true. Apparently, and strangely enough, Piaget never related this expression to the idea of analogy.

4 Seven Other Remarkable Logical Proportions

As said in Sect.Ā 2, there are 7 other code independent logical proportions. We start with two of them that are closely related to analogical proportion, before considering the last of the 4 homogeneous proportions, and finally the four heterogeneous proportions.

4.1 Two Proportions Associated with Analogy

2 other homogeneous logical proportions are closely related to analogical proportion:

  • reverse analogy: \(\text {Rev}(a, b, c,d) \triangleq ((\lnot a \wedge b) \equiv (c \wedge \lnot d)) \wedge ((a \wedge \lnot b) \equiv (\lnot c \wedge d)) \)

    It reverses analogy into ā€œb is to a as c is to dā€. In fact \(\text {Rev}(a, b, c,d)\!\Leftrightarrow \!\text {Ana}(b, a, c,d)\).

  • paralogy: \(\text {Par}(a, b, c,d) \triangleq ((a \wedge b) \equiv (c \wedge d)) \wedge ((\lnot a \wedge \lnot b) \equiv (\lnot c \wedge \lnot d))\)

    It expresses that what a and b have in common (positively or negatively), c and d have it also, and conversely. Up to a permutation, we recognize an expression similar to the expression (2) of \(a:b\,::\,c : d\). It can be checked that \(\text {Par}(a, b, c, d))\Leftrightarrow \text {Ana}(c, b, a, d)\).

In columns 2 and 3 of TableĀ 4, we give the 6 patterns that make true \(\text {Rev}\) and \(\text {Par}\), together with \(\text {Ana}\) in column 1.

Table 4. Boolean patterns making true Analogy, Reverse analogy, Paralogy, Inverse paralogy

A geometric illustration of the three proportions \(\text {Ana}\), \(\text {Par}\), \(\text {Rev}\) can be given. Indeed given 3 points \(\varvec{a}\), \(\varvec{b}\), \(\varvec{c}\) of the real plan \(\mathbb {R}^2\), one can always find a point \(\varvec{d}\) such that abdc is a parallelogram (see Fig.Ā 1). In fact, from 3 non aligned points, one can build 3 distinct parallelograms. See Fig.Ā 1 where the index of d refers to the proportion that generates it from \((\varvec{a}, \varvec{b}, \varvec{c}\)). In Fig.Ā 1, we have used different types of lines (with different width, dotted or not, arrows or not) to try to help visualizing the 3 parallelograms. Indeed if \((\varvec{a}, \varvec{b}, \varvec{c}\) are respectively represented by coordinates (0,Ā 0), (0,Ā 1), and (1,Ā 0), \(d_A\) is (1,Ā 1) and \((0,0) : (0,1)\,::\,(1, 0) : (1, 1)\) holds componentwise. It can be seen that geometrically, in \(\mathbb {R}^2\), this corresponds to the equality of vectors \(\overrightarrow{{ \varvec{a}}{ \varvec{b}}}\) and \(\overrightarrow{{ \varvec{c}} { \varvec{d}_A}}\). Applying the permutations linking \(\text {Ana}\), \(\text {Par}\), and \(\text {Rev}\), we observe that \(\overrightarrow{{ \varvec{a}}{ \varvec{b}}} = \overrightarrow{{ \varvec{d}_R} { \varvec{c}}}\) and \(\overrightarrow{{ \varvec{a}}{ \varvec{d}_P}} = \overrightarrow{{ \varvec{c}} { \varvec{b}}}\) as expected since \({ \varvec{a}}:{ \varvec{b}}v { \varvec{d}_R}: { \varvec{c}}\) iff \({ \varvec{a}},{ \varvec{b}}, { \varvec{c}},{ \varvec{d}_R}\) make a reverse analogy, while \({ \varvec{a}}:{ \varvec{d}_P}\,::\,{ \varvec{c}} : { \varvec{b}}\) iff \({ \varvec{a}},{ \varvec{b}}, { \varvec{c}},{ \varvec{d}_P}\) make a paralogy. Moreover, if we remember that \({ \varvec{a}}:{ \varvec{b}}\,::\,{ \varvec{c}}:{ \varvec{d}_A}\) holds if and only if \({\varvec{a}} - {\varvec{b}} = {\varvec{c}} - {\varvec{d}_A}\) componentwise, which is equivalent in \(\mathbb {R}\) to \({\varvec{d}_A}= - {\varvec{a}} + {\varvec{b}} + {\varvec{c}}\), we can easily deduce using permutations that \({\varvec{d}_R}= {\varvec{a}} - {\varvec{b}} + {\varvec{c}}\) and \({\varvec{d}_P}= {\varvec{a}} + {\varvec{b}} - {\varvec{c}}\). It yields \({\varvec{d}_R} =(1,-1)\) and \({\varvec{d}_P}=(-1,1)\), which indeed corresponds to the coordinates of \({\varvec{d}_R}\) and \({\varvec{d}_P}\) in \(\mathbb {R}^2\).

Fig. 1.
figure 1

Three parallelograms

4.2 Inverse Paralogy

By switching the positive and the negative similarity indicators pertaining to the pair (c,Ā d) in the definition of the paralogy, we obtain a new homogeneous logical proportion called inverse paralogy. Namely its expression is given by

$$\text {Inv}(a, b, c,d)\triangleq ((a \wedge b) \equiv (\lnot c \wedge \lnot d)) \wedge ((\lnot a \wedge \lnot b) \equiv (c \wedge d))$$

\(\text {Inv}(a, b, c,d)\) states that ā€œwhat a and b have in common, c and d do not have it and converselyā€. This expresses a kind of ā€œorthogonalityā€ between the pairs (a,Ā b) and (c,Ā d). \(\text {Inv}(a, b, c,d)\) is clearly symmetrical and code independent. It can be shown [27] that \(\text {Inv}\) is the unique logical proportion (among the 120ā€™s!) which remains unchanged under any permutation of two terms among the four. Namely \(\text {Inv}(a, b, c,d)\Leftrightarrow \text {Inv}(b, a, c,d)\Leftrightarrow \text {Inv}(a, c, b,d)\Leftrightarrow \text {Inv}(c, b, a,d)\) (the other permutations of two terms are obtained by symmetry). The patterns that make true \(\text {Inv}\) are given in the column 4 of TableĀ 4. As can be seen, it is true for the patterns encountered in the truth tables of \(\text {Ana}\), \(\text {Par}\), \(\text {Rev}\), except 0000 and 1111.

Note also in TableĀ 4 that the 6 patterns that make the four proportions true belong to a set of 8 patterns. This set of 8 patterns is characterized by the logical formula \((a\equiv b)\equiv (c \equiv d)\), which corresponds to an analogical-like connective proposed by Klein [17], in relation with anthropological materials. TableĀ 5 exhibits the pair of patterns that each proportion misses among the 8 patterns. It also shows what may be called the ā€œcharacteristicā€ patterns of a proportion, namely the ones making one of the two involved indicators true for (a,Ā b) and one true for (c,Ā d). Those latter patterns correspond to the reading of the proportion.

Table 5. Analogy, Reverse analogy, Paralogy, Inverse paralogy: Characteristic/missing patterns

An illustration of the use of the inverse paralogy is provided by Bongard problems [3] that are visual puzzles where we have two sets A and B of relatively simple pictures. All the pictures in set A have a common feature, which is lacking in all the ones in set B. The problem is to find the common feature. This corresponds to one of the characteristic patterns of \(\text {Inv}\) [30].

4.3 The 4 Heterogeneous Proportions

The 4 heterogeneous proportions are obtained by putting the \(\equiv \) connectives between indicators of different kinds for (a,Ā b) and for (c,Ā d). Their logical expressions are given below:

$$\begin{aligned}&\text {-} H_1(a,b,c,d) = (\lnot a\wedge b \equiv \lnot c \wedge \lnot d)\wedge (a\wedge \lnot b\equiv c\wedge d)\\&\text {-} H_2(a,b,c,d) = (\lnot a\wedge b\equiv c \wedge d) \wedge (a \wedge \lnot b\equiv \lnot c\wedge \lnot d)\\&\text {-} H_3(a,b,c,d) = (\lnot a\wedge \lnot b\equiv \lnot c\wedge d) \wedge (a\wedge b \equiv c \wedge \lnot d)\\&\text {-} H_4(a,b,c,d) = (\lnot a\wedge \lnot b\equiv c\wedge \lnot d) \wedge (a \wedge b\equiv \lnot c\wedge d) \end{aligned}$$

They are clearly code independent. The six patterns that make them true are given in TableĀ 6. The four heterogeneous logical proportions have a quite different semantics from the ones of homogeneous proportions. They express that there is an intruder among \(\{a,b,c,d\}\), which is not a (\(H_1\)), which is not b (\(H_2\)), which is not c (\(H_3\)), and which is not d (\(H_4\)) respectively. The reader is referred to [29] for the study of the properties of heterogeneous logical proportions. They have been shown to be appropriate for solving puzzles of the type ā€œFinding the odd one outā€ [29]. Moreover they are at the basis of an ā€œoddnessā€ measure, which has been shown to be of interest in classification, following the straightforward idea of classifying a new item in the class where it appears to the least at oddsĀ [6].

Table 6. \(H_1, H_2, H_3, H_4\) Boolean truth tables

5 Graded Analogical Proportion

Attributes or features are not necessarily Boolean, and graded extensions of logical proportions are of interest. In the following we only focus on analogical proportion. We assume that attributes are now valued in [0,Ā 1] (possibly after renormalization). There are potentially many ways of extending expressions such as Eqs.Ā (1) or (3). Still there are mainly two options that make sense [8].

The first one is obtained by replacing (i) the central \(\wedge \) in (1) by \(\min \), (ii) the two \(\equiv \) symbols by \(\min (s\rightarrow _{\L }t, t\rightarrow _{\L }s) = 1- \mid s-t\mid \), where \(s \rightarrow _{\L } t =\min (1, 1- s + t)\) is Łukasiewicz implication, (iii) the four expressions of the form \(s \wedge \lnot t\) by the bounded difference \(\max (0, s - t)= 1 - (s \rightarrow _{\L } t)\), which is associated to Łukasiewicz implication, using \(1 - (\cdot )\) as negation. The resulting expression is then

$$\begin{aligned} a : b\,::_{\L } c : d = {\left\{ \begin{array}{ll} 1- \mid (a - b) - (c-d)\mid , \\ \quad \text{ if } a \ge b \text{ and } c \ge d, \text{ or } a \le b \text{ and } c \le d\\ 1- \max (\!\mid \!a - b\!\mid , \!\mid \!c - d\!\mid ), \\ \quad \text{ if } a \le b \text{ and } c \!\ge d, \text{ or } a \ge b \text{ and } c \le d \end{array}\right. } \end{aligned}$$
(4)

It coincides with \(a:b\,::\,c : d\) on \(\{0, 1\}\). As can be seen, this expression is equal to 1 if and only if \((a - b) = (c-d)\), while \(a : b\,::_{\L } c : d =0\) if and only if (i) \(a-b = 1\) and \(c \le d\), or if (ii) \(b-a = 1\) and \(d \le c\), or if (iii) \(a \le b\) and \(c - d = 1\), or if iv) \(b \le a\) and \(d - c = 1\). Thus, \(a : b \,::_{\L } c : d =0\) when the change inside one of the pairs (a,Ā b) or (c,Ā d) is maximal, while the other pair shows either no change or a change in the opposite direction. It can be also checked that code independency continue to hold under the form \( a : b\,::_{\L } c : d = 1-a : 1-b\,::_{\L } 1-c : 1-d\).

We have pointed out that the algebraic difference between a and b equated with the difference between c and d, namely \(a-b = c-d\), provides a constraint that is satisfied by the 6 patterns making true the analogical proportion \(a:b\,::\,c : d\) in the Boolean case, and by none of the 10 others. However, \(a-b\) may not belong to \(\{0,1\}\) when \(a, b \in \{0,1\}\). When considering the graded case, the situation remains the same: \(a-b\) is not close either in [0,Ā 1], but \(a : b\,::_{\L } c : d =1\) if and only if \(a-b = c-d\); moreover, the modeling of the analogical proportion by the constraint \(a-b = c-d\) does not provide a graded evaluation of how far we are from satisfying it, as it is the case with the above extension.

There is another meaningful graded extension of the analogical proportion, which is directly obtained from Eq.Ā (3) by also taking ā€˜minā€™ for the internal conjunction and ā€˜maxā€™ for the internal disjunction. It yields the so-called ā€œconservativeā€ extension [8]:

$$\begin{aligned} a : b\,::_C c : d = \min (1-|\max (a,d)-\max (b,c)|,1-|\min (a,d)-\min (b,c)|) \end{aligned}$$
(5)

Note that Ā Ā Ā \(a : b\,::_C c : d = 1\Leftrightarrow \min (a,d)=\min (b,c)\) and \(\max (a,d)=\max (b,c)\). This means that the patterns (s,Ā s,Ā t,Ā t), and (s,Ā t,Ā s,Ā t) (and (s,Ā s,Ā s,Ā s)) are then the unique way to have the analogical proportion fully true (equal to 1). It can be checked that \(a : b\,::_{\L } c : d =1 \Rightarrow a : b\,::_C c : d =1\). For instance, \(0 : 0.5\,::_{\L } 0.5 : 1 = 1\), while \(0 : 0.5\,::_C 0.5 : 1 = 0.5\). Besides, \(a : b\,::\,c : d = 0\) if and only if \(|\min (a,d)-\min (b,c)| = 1\) or \(|\max (a,d)-\max (b,c)|=1\), i.e. the only patterns fully falsifying the analogical proportion are of the form \(1 : 0\,::\,x : 1\) or \(0 : 1\,::\,x : 0\) (and the other patterns obtained from these two by symmetry and central permutation).

6 Analogical Inference

The equation \(a : b\,::\,c : x\) in \(\mathbb {B}\) has not always a solution. Indeed neither \(0 : 1\,::\,1 : x\) nor \(1 : 0\,::\,0 : x\) have a solution (since 0111, 0110, 1000, 1001 are not valid patterns for an analogical proportion). The solution exists if and only if \((a \equiv b) \vee (a \equiv c)\) holds. When the solution exists, it is unique and given by solution \(x=c \equiv (a \equiv b)\). This corresponds to the original view advocated by Klein [17] who applied even to the cases \(0 : 1\,::\,1 : x\) and \(1 : 0\,::\,0 : x\), where it yields \(x=0\) and \(x=1\) respectively; as already said S. Klein was making no differences between \(\text {Ana}\), \(\text {Par}\) and \(\text {Rev}\).

This equation solving mechanism directly applies to Boolean vectors in \(\mathbb {B}^n\), i.e., looking for \(\varvec{x}= (x_1, \cdots , x_n)\) such as \(\varvec{a}:\varvec{b}\,::\,\varvec{c} : \varvec{x}\) holds, amounts to solving the n equations \(a_i:b_i\,::\,c_i:x_i\). When the n equations are solvable, we can observe that the analogical proportion solving process may be creative (an informal quality usually associated with the idea of analogy) in the sense that it may be the case that \(\varvec{x}\ne \varvec{a}\), \(\varvec{x}\ne \varvec{b}\), and \(\varvec{x}\ne \varvec{c}\). For instance, we obtain \(\varvec{x}=(x_1,x_2)=(0,0)\), from \(\varvec{a}= (a_1,a_2)=(1,1)\), \(\varvec{b}=(b_1,b_2)=(1,0)\), and \(\varvec{c}=(c_1,c_2)=(0,1)\).

This can be applied to completion tests such as the example of Fig.Ā 2.

Fig. 2.
figure 2

A simple analogical sequence of pictures

The problem may be encoded using the 5 Boolean predicates hasSquare(hS), hasBlack Dot(hBD), hasTriangle(hT), hasCircle(hC), hasEllipse(hE) in that order.

Fig. 3.
figure 3

A Boolean coding for Fig.Ā 2

This leads to the code of Fig.Ā 3. Applying componentwise the solving process, we get \(\varvec{x} =(0,1,1,1,0)\) which is the code of the expected solution. The approach is constructive since the missing picture \(\varvec{x}\) is obtained by computation from \(\varvec{a}, \varvec{b}, \varvec{c}\). This contrasts with the classical approaches to this problem, pioneered by Th. Evans [9] where d is to be chosen among a set of candidate pictures which contains a picture considered as being the right answer, and where the change between \(\varvec{a}\) and \(\varvec{b}\) is compared with the change between \(\varvec{c}\) and \(\varvec{x}\) for each \(\varvec{x}\), leading to choose \(\varvec{x}\) as the one maximizing the similarity between the changes. Clearly, we may imagine some sequence of pictures \(\varvec{a}, \varvec{b}, \varvec{c}\) which cannot be completed by a fourth picture \(\varvec{x}\) in the sense of analogical proportion since the equation \(\varvec{a} : \varvec{b}\,::\,\varvec{c} : \varvec{x}= \varvec{1}\) is not always solvable. When there is no analogical solution, we may think of using another homogeneous logical proportions. It should be also clear that the approach may not be suitable for solving quizzes obeying to a functional pattern of the form \(\varvec{a} : f(\varvec{a})\,::\,\varvec{b} : f(\varvec{b})\), when the features considered for defining the vectors do not account for the modeling of f; then the function f has to be guessed on the basis of some simplicity principle [1].

The equation solving process may be also restricted to a subpart of \(\varvec{x}\). This is the basic inference pattern underlying the analogical proportion-based inference, which can be described as follows: if an analogical proportion holds between p components of four vectors, then this proportion may hold for the last remaining components as well. This inference principle [35] can be formally stated as follows:

$$\frac{\forall i \in \{1,..., p\}, ~~a_i : b_i \,::\,c_i : d_i \text{ holds }}{\forall j \in \{p+1,..., n\}, ~~a_j : b_j\,::\,c_j : d_j \text{ holds }}$$

This is a generalized form of analogical reasoning, where we transfer knowledge from some components of our vectors to their remaining components, tacitly assuming that the values of the p first components determine the values of the others. Then analogical reasoning amounts to finding completely informed triples \((\varvec{a}, \varvec{b}, \varvec{c})\) suitable for inferring the missing value(s) of an incompletely informed item. In case of the existence of several possible triples leading to possibly distinct plausible conclusions, a voting procedure may be used, as in case-based reasoning where the inference is based on a collection of single cases (i.e., the nearest neighbors) rather than on a collection of triples. This inference pattern can be generalized when the p attributes include numerical ones, by computing an average score of the qualities of the analogical proportions over the p components (using one of the graded extensions of Sect.Ā 5), and choosing the prediction such that the sum of the average scores to which it is associated, is maximal.

7 Applications

We briefly survey some existing or potential applications of analogical proportions.

Classification. Classification is an immediate application of the above inference principle where one has to predict a class \(cl(\varvec{x})\) (viewed as a nominal attribute) for a new item \(\varvec{x}\). Then one looks for triples \((\varvec{a}, \varvec{b}, \varvec{c})\) of items with a known class, for which the class equation \(cl(\varvec{a}) : cl(\varvec{b})\,::\,cl(\varvec{c}) : cl(\varvec{x})\) is solvable, and for which analogical proportions hold with \(\varvec{x}\) on the attributes describing the items. It has been first successively applied to Boolean attributes [4, 21] and then extended to nominal and to numerical ones [5]. Recent formal studies have shown that analogical classifiers always give exact predictions in the special cases where the classification process is governed by an affine Boolean function (which includes x-or functions) and only in this case, which does not prevent to get good results in other cases (as observed in practice), but which is still to be better understood [7, 16]. This suggests that analogical proportions enforces a form of linearity, just as numerical proportions fit with linear interpolation.

Raven Tests. They are IQ tests, where one is faced with a \(3 \times 3\) matrix with 8 cells containing pictures, where one has to guess what is the right contents of the empty ninth cell, among 8 proposed solutions. An exampleFootnote 1 is given with its solution (a simple big square) in Fig.Ā 4. The idea is to postulate that in a line (and maybe in a column), the picture of the 3rd cell is to the pictures of the first two pictures as the picture of the 3rd cell is to the pictures of the first two pictures in the next line (or column). It amounts to dealing with proportions of the form \((\varvec{c}ell_1, \varvec{c}ell_2): \varvec{c}ell_3\,::\,(\varvec{c}ell'_1, \varvec{c}ell'_2): \varvec{c}ell'_3\) (where the \(\varvec{c}ell_i\)ā€™s refer to feature-based vectors describing the feature. Then the application of the analogical inference amounts to copying patterns observed in other lines or columns, feature by feature. Again the solution is built, and not chosen. See [1] for details and discussions.

Fig. 4.
figure 4

Modified Raven test 12 and its solution

Analogy-Based Decision. Let us just outline the idea using generic scenarii; see [2] for details. Suppose that a decision \(\delta \) was experienced in two different situations \(sit_1\) and \(sit_2\) in the presence or not of special circumstances, leading to good or bad results respectively depending on the absence or on the presence of these special circumstances. Suppose we have in our repository the first three lines of the following table (cases a, b, c), while we wonder if we should consider applying decision \(\delta \) or not in \(sit_2\) when no special circumstances are present (case d). The analogical inference leads here to the prediction that the result should be good.

figure a

Note that if we apply a case-based decision view, case d might be found quite similar to case c, since they are identical on all the features used for describing situation \(sit_2\), and differs only on the maybe unique feature describing the so-called ā€œspecial circumstancesā€; this would lead to favor the idea that decision \(\delta \) in case d would also lead to a bad result as in case c. Still, a more careful examination of cases a,Ā b,Ā c may lead to an opposite conclusion. Indeed it is natural to implicitly assume here that the possibly many features gathered here under the labels ā€œsituationā€ and ā€œspecial circumstancesā€ are enough for describing the cases and for determining the quality of the result of decisions applied to the cases. Thus, the fact that in \(sit_1\), the quality of the result of decision \(\delta \) is bad (resp. good) is explained by the presence (resp. absence) of ā€œspecial circumstancesā€. Then the analogical inference enforces here that we should have the same behavior in \(sit_2\).

Rather than analogically predicting the evaluation of the output of a potential decision in a new situation, one may suppose that we start with a repertory of recommended actions in a variety of circumstances, and then one may also think of trying to take advantage of the creative capabilities of analogy for adapting a decision to the new situation. This may be useful when the final decision has diverse options. Such as ServeĀ aĀ tea with or without sugar, with or without milk. Let us consider this example to illustrate the idea. As stored in the table below, in situation \(sit_1\) with contraindication (\(c \ i\)), it is recommended to serve tea only, in situation \(sit_1\) with no \(c \ i\), tea with sugar, while in situation \(sit_2\) with \(c \ i\) one serves tea with milk. What to do in situation \(sit_2\) with no \(c \ i\) ? Common sense suggests tea with sugar and milk, maybe. It is what analogical proportion equations says: indeed \(\delta :\delta \,::\,\delta : x\), \(0:1\,::\,0 : y\) and \(0:0\,::\,1 : z\) yield \(xyz= \delta 11\) as in the table below.

figure b

Analogical Inequalities. An analogical proportion states that the results of the comparisons of a and b on the one hand, and of c and d on the other hand, in terms of dissimilarity indicators, are the same. Analogical inequalities [31] weaken such statements of identity into statements of the form ā€œa is to b at least as much as c is to dā€. Starting from the Boolean expression (1) of the analogical proportion, we replace the two symbols \(\equiv \) expressing sameness by two \(\rightarrow \) for modeling the fact that the result of the comparison of c and d is larger or equal to the result of the comparison of a and b. Namely, we obtain

$$\begin{aligned} a : b<\!\!\!< c : d = ((a \wedge \lnot b) \rightarrow (c \wedge \lnot d)) \wedge ((\lnot a \wedge b)\rightarrow (\lnot c \wedge d)) \end{aligned}$$
(6)

It can be checked from the definition that the following expected properties hold:

  • \(a : b<\!\!\!< a : b\)

  • \(a : b\,::\, c : d \Rightarrow a : b<\!\!\!< c : d\)

  • \(a : b\,::\,c : d \Leftrightarrow ((a : b<\!\!\!< c : d)\wedge (c : d<\!\!\!< a : b))\)

  • \((a : b<\!\!\!< c : d) \Leftrightarrow (\lnot a : \lnot b<\!\!\!< \lnot c : \lnot d)\)

Indeed, \(a : b<\!\!\!< c : d\) is weaker than \(a : b\,::\,c : d\). More precisely \(a:b<\) \(<c:d\) holds true for the 6 patterns that makes analogical proportion true, plus the 4 patterns 0001, 0010, 1110, 1101. These latter patterns correspond to the 4 situations where \(a \equiv b\) and \(c \not \equiv d\). In these 4 situations a and b are indeed strictly closer than c and d, and these are the only cases in \(\{0,1\}\). It can be checked that \(a : b<\!\!\!< c : d\) is true if and only if \((a : b\,::\,c : d) \vee (a \equiv b)\) is true.

When extended to the multiple-valued case [31], we obtain graded analogical inequalities that might be of interest in visual multiple-class categorization tasks for the handling of pieces of knowledge about semantic relationships between classes of the form ā€œa is to b at least as much as c is to dā€ where a,Ā b,Ā c,Ā d refer to the value of a feature of interest in 4 different images [18].

8 Conclusion: Link and Differences with Case-Based Reasoning

As suggested in this overview, and already emphasized in [26], analogical proportion-based inference departs from case-based reasoning since the former takes advantage of triples for extrapolating plausible conclusions, while the latter exploits the similarity of the new case with a collection of stored cases considered one by one. Indeed, although ā€œ\(<solution_1>\) is to \(<problem_1>\) as \(<solution_2>\) is to \(<problem_2>\)ā€ may be regarded as an analogical proportion, the view presented here assumes that the vectors representing the four items in the analogical proportion ā€œ\(\varvec{a}\) is to \(\varvec{b}\) as \(\varvec{c}\) is to \(\varvec{d}\)ā€ are all defined on the same set of features. As suggested when discussing analogy-based decision, we would rather suggest to exploit analogical proportions of the form \((<problem_1>,<solution_1>): (<problem_2>,<solution_2>)\,::\,\) \((<problem_3>,<solution_3>): (<problem_0>,<solution_0>)\) for extrapolating \(<solution_0>\) from 3 known cases (\(\{(<problem_i>,<solution_i>) \ | \ i=1,3\}\)) by solving equation \(<solution_1>:<solution_2>\,::\,<solution_3>: <solution_0>\) (where \(<solution_0>\) is unknown), provided that \(<problem_1>: <problem_2>\,::\,\) \(<problem_3>: <problem_0>\) holds.

The view advocated here is also in line with the use of the creative power of analogical reasoning for, e.g., creating a new recipe from known ones, as suggested by the following example where one knows about lemon pie (\(\varvec{a}\)), lemon cream (\(\varvec{b}\)), and apple pie (\(\varvec{c}\)) (roughly described by obvious features here), leading to the creation of the apple cream, in a spirit not so far of adaption methods in case-based reasoning [10].

figure c

Moreover, the computation of the result of the inference may give birth to some explanation: pies and creams are two desserts, both lemon and apple are juicy fruits, lemon pie and lemon cream exist, apple pie exists, why not trying apple cream ?

The handling of analogical proportions ā€œ\(\varvec{a}\) is to \(\varvec{b}\) as \(\varvec{c}\) is to \(\varvec{d}\)ā€ where \(\varvec{a}\) and \(\varvec{c}\) do no belong to the same conceptual universe as \(\varvec{b}\) and \(\varvec{d}\), as in the sentence ā€œOslo is to Norway as Paris is to Franceā€ is more tricky [20, 23] and still under study.