Abstract
The link between information theory and fuzzy logic has been established in several previous papers. Starting from this point, we present a review of the concept of divergence measures, which were proposed as a tool for comparing two fuzzy sets. The initial definition stems from the ideas behind the classical concept of divergence between two probability distributions. Following a path similar to the one used to obtain fuzziness measures from uncertainty measures, we are able to define fuzzy divergences. In addition, some possible generalizations are considered.
1 Introduction
Dealing with a lack of information is a common problem in many areas. This lack of information can take two different forms: uncertainty or imprecision. In the first case, we deal with experiments that can have more than one possible outcome; each possible outcome can be specified in advance, but the actual outcome depends on chance. For instance, in a coin toss we know the two possible outcomes, heads or tails, but we do not know the final result. In the second case, we have no uncertainty about the result of the experiment, but imprecision. Thus, for instance, if we consider again the coin toss, the coin may already have been thrown, but perhaps it is so old that we are not sure whether the face it shows is clearly a head.
Information theory studies the quantification and communication of information; in particular, it measures the amount of uncertainty involved in the outcome of a random experiment. It was originally proposed by Shannon [31] in 1948 as a tool in signal processing. This theory thus combines many different fields, such as mathematics, statistics, computer science, physics and electrical engineering. From the beginning, it proved to be an interesting tool in many other areas, and therefore many researchers started to work on it (Rényi [30], Oniçescu [27], Sharma and Mittal [32], Havrda and Charvát [11], etc.). Later, an important step was taken by Kampé de Fériet and Forte [12] with an axiomatic definition of information with or without a probability measure. Building on the theoretical aspects of this theory, Kullback [17] found many interesting applications in statistical inference, and from this initial application many papers have been developed in the area. In particular, some very important achievements have been obtained by Pardo (see, among others, [28, 29]). An important review of all these theories can be found in Gil [7], as he was one of the most important researchers in this area in Spain. Divergence measures between probability distributions were an important topic in that monograph, and they are the starting point of this chapter, as we will see later.
On the other hand, Zadeh [34] introduced in 1965 the concept of fuzzy set, as a way to model vague or poorly defined properties for situations in which it is not possible to fully discriminate between having and not having the said properties. From that, a whole mathematical and applied theory to deal with imprecision was developed. It is known as Fuzzy Logic Theory. Two interesting monographs about this theory were written by Dubois and Prade [6] and Klir and Folger [13].
As we can see from the title of this last book, the concepts of fuzzy sets, uncertainty and information are intertwined. This is not by chance, since these topics are closely related, as shown in [8,9,10]. In particular, we have studied [24] the relationship between the uncertainty measures defined in Information Theory [12] and the fuzziness measures introduced by De Luca and Termini [5] and later analyzed in greater depth by Knopfmacher [14]. The link between measures of uncertainty and imprecision in fuzzy environments lies in what we will refer to as a divergence measure, by analogy with the classical meaning of the term used for comparing two probability distributions (see, for instance, [29]). The main purpose of this chapter is to use these measures to compare two fuzzy sets.
As introductory notions, we present in Sect. 2 two axiomatic definitions for measuring entropy: uncertainty measures and fuzziness measures. A study of the relationship between them, in the most general context, is also given there. The definition of a divergence measure between fuzzy sets is given in Sect. 3, following the ideas considered previously. The most important results are contained in that section, where we also comment on some extensions. Finally, we conclude the work with some comments in Sect. 4.
2 Preliminaries
Necessary concepts to understand the remaining parts of this work are given in this section. In particular, we will focus on the definitions and notations for uncertainty measures and fuzziness measures.
2.1 Uncertainty Measures
The first probabilistic uncertainty measure (also called entropy) was given by Shannon [31] in the context of Communication Theory. That initial definition considered that the uncertainty of a random experiment can be measured by means of the quantity
$$\begin{aligned} H(p_1,\ldots ,p_n)=-\sum _{i=1}^{n} p_i \log p_i, \end{aligned}$$
where the values \(p_i\) represent the probabilities of the possible results of the experiment.
From that initial definition, a lot of generalizations have been proposed in the literature.
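As a simple computational illustration (a sketch of ours, not part of the original theory), Shannon's entropy can be evaluated as follows, using the natural logarithm and the convention \(0 \log 0 = 0\):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H(P) = -sum_i p_i log p_i (natural log, 0*log 0 := 0)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# A degenerate distribution carries no uncertainty, while uncertainty
# grows as the probabilities become more even.
assert shannon_entropy([1.0, 0.0]) == 0.0
assert shannon_entropy([0.3, 0.7]) < shannon_entropy([0.5, 0.5])
```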
Thus, Menéndez et al. [19] proved that all these measures of entropy are part of a wider family, named the h-\(\phi \)-entropies.
This family is slightly more general than Ben Bassat's family of f-entropies, which were defined as those functions that can be expressed as
$$\begin{aligned} H_f(p_1,\ldots ,p_n)=\sum _{i=1}^{n} f(p_i), \end{aligned}$$
where f is a concave function.
Later, the quasi-\(\phi \)-entropies were introduced and characterized in the case of discrete distributions [3]. This family is more general than Ben Bassat's, but different from the family of h-\(\phi \)-entropies. More precisely, they are defined in terms of a function \(\phi \) such that \(\phi (\lambda x + (1-\lambda ) y) \ge \lambda \phi (x) + (1-\lambda ) \phi (y)\), for all \(x,y\in [0,1]\) with \(x+y\le 1\) and all \(\lambda \in [0,1]\).
An important property of uncertainty measures is the Principle of Transfers, or Pigou–Dalton condition. An uncertainty measure H fulfils this property if, given two probability distributions P and \(P'\) with values \((p_1, p_2, \ldots , p_n)\) and \((p'_1, p'_2, \ldots , p'_n)\) respectively, such that \(p_k = p'_k\) for all \(k\notin \{i,j\}\), \(p'_i = p_i-\delta \) and \(p'_j = p_j+\delta \) for some \(0\le \delta \le (p_i-p_j)/2\), then \(H(P)\le H(P')\).
This is a very natural property, since it means that the more similar the probabilities of two outcomes of an experiment are, the higher the uncertainty.
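A quick numerical check of this property with Shannon's entropy (our own illustration, not part of the chapter): moving probability mass from a more likely outcome to a less likely one increases the entropy.

```python
import math

def shannon_entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

def transfer(probs, i, j, delta):
    """Pigou-Dalton transfer: move mass delta from p_i to p_j (p_i >= p_j)."""
    q = list(probs)
    q[i] -= delta
    q[j] += delta
    return q

P = [0.7, 0.2, 0.1]
Q = transfer(P, 0, 1, 0.1)   # delta = 0.1 <= (0.7 - 0.2) / 2
assert shannon_entropy(P) <= shannon_entropy(Q)
```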
2.2 Fuzziness Measures
After having commented on some results about uncertainty measures, or probabilistic entropies, we now introduce fuzzy sets and the measures of their fuzziness, i.e., the non-probabilistic entropies.
They are well-known and can be found in a wide range of sources (see, for instance, the classical books [6, 13]).
The universal set is denoted by X. A fuzzy subset of X is a mapping from X into the unit interval [0, 1].
In this framework, we use the following notations:
-
\(\mathscr {P}(X)\) is the set of all subsets of X,
-
\({\mathscr {F}}(X)\) is the set of all fuzzy subsets of X,
-
\(A \in \mathscr {P}(X)\) will denote any crisp set,
-
\(\widetilde{A} \in {\mathscr {F}}(X)\) will denote any fuzzy set.
We identify a fuzzy set and its membership function. Thus we have that \(X(x)=1\) for all \(x \in X\) and for the empty set we have \(\emptyset (x)=0\) for all \(x \in X\).
Two further important concepts are the containment relation and the complement set. We consider the standard Zadeh’s negation for the complement (see [34]).
Definition 2.1
Let \(\widetilde{A}, \widetilde{B}\in \mathscr {F}(X)\). The complement of \(\widetilde{A}\) is the fuzzy set \(\widetilde{A}^c(x)=1-\widetilde{A}(x)\), \(x\in X\). \(\widetilde{A}\) is contained in \(\widetilde{B}\), denoted by \(\widetilde{A}\subseteq \widetilde{B}\) if \(\widetilde{A}(x)\le \widetilde{B}(x)\) for all \(x\in X\).
Apart from the previous relation of containment, we consider the concepts of intersection and union of fuzzy sets. The initial definitions were also given in [34] by means of the minimum and the maximum operators.
However, these are not the only way to generalize the classical set operations, since there exists a broader class of functions to represent them. In fact, for the intersection this class is referred to as t-norms, and for the union as t-conorms.
A triangular norm (t-norm) is a function \(T:[0,1] \times [0,1] \rightarrow [0,1]\) satisfying the following properties:
-
(T1)
\(T(a,b)=T(b,a)\), for all \(a,b\in [0,1]\),
-
(T2)
\(T(T(a,b),c)=T(a,T(b,c))\), for all \(a,b,c\in [0,1]\),
-
(T3)
\(b \le c \Rightarrow T(a,b) \le T(a,c)\), for all \(a,b,c\in [0,1]\),
-
(T4)
\(T(a,1)=a\), for all \(a\in [0,1]\).
Some important examples of t-norms are:
-
Minimum: \(T_M(a,b)=\min (a,b)\), for all \(a,b\in [0,1]\),
-
Product: \(T_P(a,b)=a\cdot b\), for all \(a,b\in [0,1]\),
-
Łukasiewicz t-norm: \(T_L(a,b)=\max (a+b-1,0)\), for all \(a,b\in [0,1]\),
-
Drastic t-norm:
$$\begin{aligned} T_D(a,b)= \left\{ \begin{array}{ll} \min (a,b), &{} \text { if } \max (a,b)=1 \\ 0, &{} \text { otherwise} \end{array}\right. \,. \end{aligned}$$
For these basic t-norms it holds that \(T_D \le T_L \le T_P \le T_M\). In fact, for any t-norm T, it is true that \(T_D \le T \le T_M\). By changing the neutral element from 1 to 0, we obtain the triangular conorms (t-conorms).
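These four basic t-norms, and the ordering between them, can be checked numerically; the sketch below is our own illustration (using a grid of exactly representable dyadic values to avoid floating-point artifacts):

```python
def t_min(a, b): return min(a, b)                 # minimum t-norm
def t_prod(a, b): return a * b                    # product t-norm
def t_luk(a, b): return max(a + b - 1.0, 0.0)     # Lukasiewicz t-norm
def t_drastic(a, b):                              # drastic t-norm
    return min(a, b) if max(a, b) == 1.0 else 0.0

# Pointwise ordering T_D <= T_L <= T_P <= T_M on a grid of eighths
grid = [i / 8 for i in range(9)]
for a in grid:
    for b in grid:
        assert t_drastic(a, b) <= t_luk(a, b) <= t_prod(a, b) <= t_min(a, b)
```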
A t-norm T and a t-conorm S are dual iff for each \(a,b \in [0,1]\) it holds that \(T(a,b)=1-S(1-a,1-b)\).
The dual conorms of the t-norms presented earlier are the following:
-
Maximum: \(S_M(a,b)=\max (a,b)\), for all \(a,b\in [0,1]\),
-
Probabilistic sum: \(S_P(a,b)=a+b-a\cdot b\), for all \(a,b\in [0,1]\),
-
Łukasiewicz t-conorm: \(S_L(a,b)=\min (a+b,1)\), for all \(a,b\in [0,1]\),
-
Drastic t-conorm:
$$\begin{aligned} S_D(a,b)= \left\{ \begin{array}{ll} \max (a,b), &{} \text { if } \min (a,b)=0 \\ 1, &{} \text { otherwise } \end{array}\right. \,. \end{aligned}$$
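The duality between each t-norm and its t-conorm can likewise be verified pointwise (again just an illustrative sketch of ours, on a grid of exactly representable values):

```python
def t_min(a, b): return min(a, b)
def s_max(a, b): return max(a, b)
def t_luk(a, b): return max(a + b - 1.0, 0.0)
def s_luk(a, b): return min(a + b, 1.0)

# Duality: T(a,b) = 1 - S(1-a, 1-b), checked on a grid of eighths
grid = [i / 8 for i in range(9)]
for a in grid:
    for b in grid:
        assert t_min(a, b) == 1.0 - s_max(1.0 - a, 1.0 - b)
        assert t_luk(a, b) == 1.0 - s_luk(1.0 - a, 1.0 - b)
```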
Using t-norms and t-conorms, we can define the intersection and union of two fuzzy sets as follows.
Definition 2.2
Let \(\widetilde{A}, \widetilde{B} \in \mathscr {F}(X)\). Given a t-norm T and a t-conorm S,
-
\(\widetilde{A} \cap \widetilde{B}(x)=T(\widetilde{A}(x),\widetilde{B}(x)), \forall x\in X\);
-
\(\widetilde{A} \cup \widetilde{B}(x)=S(\widetilde{A}(x),\widetilde{B}(x)), \forall x\in X\).
Thus, we can denote by (X, T, S) the triple formed by the universe with the t-norm and the t-conorm defining the intersection and the union, respectively.
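On a finite universe, representing fuzzy sets as dictionaries of membership degrees, Definition 2.2 can be sketched as follows (our own illustrative code, using Zadeh's original operators as defaults):

```python
def fuzzy_intersection(A, B, T=min):
    """Pointwise intersection via a t-norm T (default: minimum)."""
    return {x: T(A[x], B[x]) for x in A}

def fuzzy_union(A, B, S=max):
    """Pointwise union via a t-conorm S (default: maximum)."""
    return {x: S(A[x], B[x]) for x in A}

A = {"x1": 0.2, "x2": 0.8, "x3": 1.0}
B = {"x1": 0.5, "x2": 0.4, "x3": 0.0}
assert fuzzy_intersection(A, B) == {"x1": 0.2, "x2": 0.4, "x3": 0.0}
assert fuzzy_union(A, B) == {"x1": 0.5, "x2": 0.8, "x3": 1.0}
```

Any other t-norm/t-conorm pair can be passed in place of the defaults to obtain the corresponding operations.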
The entropy for a fuzzy set is quantified by means of the non-probabilistic entropies or fuzziness measures (see, for instance, [33]), which are defined as follows.
Definition 2.3
A fuzziness measure is a real function f defined on \({\mathscr {F}}(X)\), fulfilling the following requirements:
-
(a)
\(f(\widetilde{A}) = 0 \Longleftrightarrow \widetilde{A}\) is a crisp set.
-
(b)
If \(\widetilde{A}, \widetilde{B} \in {\mathscr {F}}(X)\) and \(\widetilde{A}\) is “sharper” than \(\widetilde{B}\), then \(f(\widetilde{A}) \le f(\widetilde{B})\).
-
(c)
\(f(\widetilde{A})\) takes maximum value if and only if \(\widetilde{A}\) is “maximally fuzzy”.
This last definition is based on the concepts “sharper than” and “maximally fuzzy”, although the second follows from the first. The most usual criteria used to define the relation “to be sharper than” are the following:
-
\(\widetilde{A}\) is sharper than \(\widetilde{B}\) iff either \(\widetilde{A}(x)\le \widetilde{B}(x)\le 1/2\) or \(\widetilde{A}(x) \ge \widetilde{B}(x)\ge 1/2\) for any x in X (see [13]) or
-
\(\widetilde{A}\) is sharper than \(\widetilde{B}\) iff \(|\widetilde{A}(x)-1/2|\le | \widetilde{B}(x)- 1/2|\) for any x in X (see [6]).
It is clear that the first criterion is a particular case of the second, and therefore we will consider the more general definition.
Knopfmacher introduced in 1975 a very important family of fuzziness measures, the Knopfmacher class [14], given by the functions f of the form
$$\begin{aligned} f(\widetilde{A})=F\left( \sum _{x\in X} c_x\, g_x(\widetilde{A}(x))\right) \end{aligned}$$
for any \(\widetilde{A}\in \mathscr {F}(X)\), where \(c_x\in \mathbb {R}^{+}\); \(g_x\) is a real-valued function such that \(g_x(0) = g_x(1) = 0\), \(g_x(t) = g_x(1-t)\) for all \(t\in [0,1]\), and \(g_x\) is strictly increasing on [0, 1/2]; and F is a positive strictly increasing function with \(F(0)=0\).
Later, we considered a particular class of Knopfmacher fuzziness measures (see [20, 22]), where F is the identity, \(g_x\) is the same for all \(x\in X\) (we denote \(g_x\) by \(u_f\), or simply u) and u is concave. Any function in this family is called a local fuzziness measure.
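For instance (our own sketch), taking \(u(t) = -t \log t - (1-t)\log (1-t)\), which is concave, symmetric about 1/2 and vanishes at 0 and 1, yields the classical De Luca–Termini entropy [5] as a local fuzziness measure on a finite universe:

```python
import math

def u(t):
    """Concave on [0,1], u(0)=u(1)=0, symmetric about 1/2."""
    if t in (0.0, 1.0):
        return 0.0
    return -t * math.log(t) - (1 - t) * math.log(1 - t)

def fuzziness(A):
    """Local fuzziness measure: sum of u over the membership degrees."""
    return sum(u(m) for m in A.values())

crisp = {"x1": 0.0, "x2": 1.0}
sharp = {"x1": 0.2, "x2": 0.9}
blurry = {"x1": 0.4, "x2": 0.6}   # blurry's memberships are closer to 1/2
assert fuzziness(crisp) == 0.0               # crisp sets have no fuzziness
assert fuzziness(sharp) < fuzziness(blurry)  # sharper => less fuzzy
```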
2.3 From Uncertainty to Fuzziness
Proposition 2.1
([24]) Let \((X,\mathscr {A},\mu )\) be a measurable space and let H be an uncertainty measure fulfilling the Pigou–Dalton condition and such that \(H(P) = 0 \Longleftrightarrow P\) is degenerate. The map f defined by
$$\begin{aligned} f(\widetilde{A})=\int _X H\bigl (\widetilde{A}(x),1-\widetilde{A}(x)\bigr )\,d\mu (x),\quad \widetilde{A}\in \mathscr {F}(X), \end{aligned}$$
is a fuzziness measure and it belongs to Knopfmacher's class.
If we work on some particular spaces, we are also able to establish a one-to-one correspondence between fuzziness measures and uncertainty measures.
Thus, if we restrict ourselves to an appropriate subset \({\mathscr {H}}_2\) of uncertainty measures, we obtain the injective property, as we can see in the following proposition.
Proposition 2.2
([24]) Let \(F_1\) be the map from \({\mathscr {H}}_2\) into \({\mathscr {F}}\) given by the construction of Proposition 2.1, where \({\mathscr {F}}\) denotes Knopfmacher's class of fuzziness measures. Then \(F_1\) is injective.
If we restrict our study to the family of \(\phi \)-entropies given by \({\mathscr {H}}_{\phi } = \{H\in {\mathscr {H}}_2 \mid \phi \text { is concave}\}\) and to the family of fuzziness measures given by \({\mathscr {F}}_1 = \{ f\in {\mathscr {F}} \text { with } g \text { continuous}\}\), we obtain a bijection.
Theorem 2.1
([24]) There exists a one-to-one correspondence between the family of uncertainty measures \({\mathscr {H}}_{\phi }\) and the family of fuzziness measures \({\mathscr {F}}_1\).
3 Divergence Measures
From the previous section, we can see that the imprecision about the membership of any element \(x\in X\) in a fuzzy set \(\widetilde{A}\) can be represented by the probability distribution \(\{\widetilde{A}(x),\widetilde{A}^c(x)\}\). We then looked at the classical divergence measures between probability distributions (see, for instance, [7, 29]) as a way to compare two fuzzy sets.
Thus, from this starting point, we proposed a new way to compare two fuzzy sets [20], the divergence, with the following properties:
-
It becomes zero when the two sets coincide.
-
It is a nonnegative and symmetric function.
-
It decreases when the two sets become “more similar” in some sense.
While it is easy to formulate the first and second conditions analytically, the third depends on the formalization of the concept “more similar”. We base our approach on the fact that if we unite a set \(\widetilde{C}\) with both fuzzy sets \(\widetilde{A}\) and \(\widetilde{B}\), we obtain two sets which are closer to each other, and similarly for the intersection.
Definition 3.1
Let (X, T, S) be a triple with X a universe and T and S any t-norm and t-conorm, respectively. A map \(D: {\mathscr {F}}(X)\times {\mathscr {F}}(X) \rightarrow \mathbb {R}\) is a divergence measure with respect to (X, T, S) iff for all \(\widetilde{A}, \widetilde{B} \in {\mathscr {F}}(X)\), D satisfies the following conditions:
-
(a)
\(D(\widetilde{A},\widetilde{A}) = 0\);
-
(b)
\(D(\widetilde{A},\widetilde{B}) = D(\widetilde{B},\widetilde{A})\);
-
(c)
\(\max \{D(\widetilde{A}\cup \widetilde{C}, \widetilde{B}\cup \widetilde{C}), D(\widetilde{A}\cap \widetilde{C}, \widetilde{B}\cap \widetilde{C})\} \le D(\widetilde{A},\widetilde{B})\), for all \(\widetilde{C}\in {\mathscr {F}}(X)\), where the union and intersection are defined by means of S and T, respectively.
It is clear that a divergence measure is associated with a triple (X, T, S): a map D may be a divergence measure with respect to one t-norm and fail to be one with respect to a different t-norm.
However, when there is no ambiguity, we will simply speak of a divergence measure, without specifying the t-norm and t-conorm used.
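As an illustration of Definition 3.1 (a sketch under our own choice of map, akin to the weighted-difference divergences studied in the literature, not necessarily the chapter's examples), the normalized Hamming difference satisfies the three axioms for the triple \((X, T_M, S_M)\); the assertions below spot-check them on concrete sets:

```python
def hamming_divergence(A, B):
    """D(A,B) = (1/|X|) * sum_x |A(x) - B(x)| over a finite universe X."""
    n = len(A)
    return sum(abs(A[x] - B[x]) for x in A) / n

def union(A, B):          # Zadeh union (t-conorm = maximum)
    return {x: max(A[x], B[x]) for x in A}

def intersection(A, B):   # Zadeh intersection (t-norm = minimum)
    return {x: min(A[x], B[x]) for x in A}

A = {"x1": 0.1, "x2": 0.7, "x3": 0.9}
B = {"x1": 0.4, "x2": 0.5, "x3": 0.8}
C = {"x1": 0.3, "x2": 0.9, "x3": 0.2}

d = hamming_divergence
assert d(A, A) == 0.0                          # axiom (a)
assert d(A, B) == d(B, A)                      # axiom (b)
assert d(union(A, C), union(B, C)) <= d(A, B)  # axiom (c), union part
assert d(intersection(A, C), intersection(B, C)) <= d(A, B)
```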
After different studies of this concept [2, 20, 22,23,24], we presented the most general study in [15], where we can also find the following examples.
Example 3.1
([15]) The map
is a divergence for any triple (X, T, S).
On the other hand, if we consider the map
where \(\alpha _x \ge 0\) for any \(x\in X\), \(\sum _{x\in X} \alpha _x=1\) and X is a finite space, D is a divergence for the minimum, product and Łukasiewicz t-norms, but not for the drastic t-norm.
When the minimum t-norm is considered, a divergence measure can be seen as a particular case of a dissimilarity, the most usual tool for comparing two fuzzy sets [18].
Moreover, it avoids some counterintuitive examples of dissimilarities, while both divergence and dissimilarity measures can be seen as particular cases of the general measures of comparison given by Bouchon-Meunier et al. [1] in 1996. An interesting study of different ways to compare fuzzy sets can be found in [4].
From this starting point, we have been able to generalize this concept to define the divergence measure for comparing two intuitionistic fuzzy sets [25].
The particular case of local divergences for intuitionistic fuzzy sets was studied in [26]. There, we presented interesting applications of this concept in Pattern Recognition and Decision Theory.
A similar generalization has been done for hesitant fuzzy sets in [16].
Moreover, we have been able to use the divergences to measure the fuzziness of a fuzzy set by comparing it with the closest crisp set and conversely, we have used fuzziness measures to define a divergence measure [21].
All these definitions and results can be considered as a heritage of the classical divergence measures, and more precisely, of the knowledge about them conveyed by Prof. Gil to the authors of this work.
4 Conclusion
In this chapter we have studied some relationships among different ways to compare two elements under uncertainty and imprecision.
Thus, we have used the classical divergence measures between two probability distributions to obtain a new way to compare two fuzzy sets. In some cases this is a particular case of dissimilarity, and it has very interesting specific properties.
The link between randomness and fuzziness is proven one more time, as we did previously for probabilistic and non-probabilistic entropies.
References
Bouchon-Meunier B, Rifqi M, Bothorel S (1996) Towards general measures of comparison of objects. Fuzzy Sets Syst 84:143–153
Couso I, Montes S (2008) An axiomatic definition of fuzzy divergence measures. Int J Uncertain Fuzziness Knowl Based Syst 16(1):1–17
Couso I, Gil P (1998) Characterization of a family of entropy measures. In: Proceedings of IPMU’98. Editions EDK, Paris
Couso I, Garrido L, Sánchez L (2013) Similarity and dissimilarity measures between fuzzy sets: a formal relational study. Inform Sci 229:122–141
De Luca A, Termini S (1972) A definition of a nonprobabilistic entropy in the setting of fuzzy sets theory. Inf Control 20:301–312
Dubois D, Prade H (1980) Fuzzy sets and systems: theory and applications. Academic Press, New York
Gil P (1981) Teoría Matemática de la Información. ICE, Madrid
Gil MA, Gil P (2015) Randomness and fuzziness: combined better than unified. In: Magdalena L, Verdegay JL, Esteva F (eds) Enric trillas: a passion for fuzzy sets, studies in fuzziness and soft computing, vol 322. Springer, Cham
Gil MA, López MT, Gil P (1985) Quantity of information; comparison between information systems: 1. Non-fuzzy states. Fuzzy Sets Syst 15:65–78
Gil MA, López MT, Gil P (1985) Quantity of information; comparison between information systems: 2. Fuzzy states. Fuzzy Sets Syst 15:129–145
Havrda J, Charvát F (1967) Quantification method of classification processes. Concept of structural \(\alpha \)-entropy. Kybernetika 3(1):30–35
Kampé de Fériet J, Forte B (1967) Information et probabilité. CR Acad Sci Paris Ser A 265:110–114, 142–146, 350–353
Klir GJ, Folger TA (1988) Fuzzy sets, uncertainty, and information. Prentice Hall, Upper Saddle River
Knopfmacher J (1975) On measures of fuzziness. J Math Anal Appl 49:529–534
Kobza V, Janis V, Montes S (2017) Generalized local divergence measures. J Intell Fuzzy Syst 33:337–350
Kobza V, Janis V, Montes S (2017) Divergence measures on hesitant fuzzy sets. J Intell Fuzzy Syst 33:1589–1601
Kullback S (1959) Information theory and statistics. Wiley, New York
Liu X (1992) Entropy, distance measure and similarity measure of fuzzy sets and their relations. Fuzzy Sets Syst 52:305–318
Menéndez ML, Morales D, Pardo L, Salicrú M (1993) Asymptotic distribution of \((h,\phi )\)-entropies. Commun Stat Theory Meth 22(7):2015–2031
Montes S, Gil P (1998) Some classes of divergence measures between fuzzy subsets and between fuzzy partitions. Mathw Soft Comput 5:253–265
Montes S, Couso I, Bertoluzza C (1998) Some classes of fuzziness measures from local divergences. Belg J Oper Res Stat Comput Sci 38:37–49
Montes S, Gil P, Bertoluzza C (1998) Divergence between fuzzy sets and fuzziness. In: Proceedings of IPMU’98. Editions EDK, Paris
Montes S, Couso I, Gil P, Bertoluzza C (2002) Divergence measures between fuzzy sets. Int J Approx Reason 30(2):91–105
Montes S, Couso I, Jimenez J, Gil P (2005) Las medidas de incertidumbre probabilística y no probabilística como herramienta en la comparación de conjuntos. In: Volumen homenaje al Profesor Ildefonso Yáñez de Diego. UNED, Madrid
Montes I, Pal N, Janis V, Montes S (2015) Divergence measures for intuitionistic fuzzy sets. IEEE Trans Fuzzy Syst 23(2):444–456
Montes I, Pal N, Janis V, Montes S (2016) Local divergences for Atanassov intuitionistic fuzzy sets. IEEE Trans Fuzzy Syst 24(2):360–373
Oniçescu O (1966) Theorie de l’information. Energie informationelle. CR Acad Sci Paris Ser A 263:841–842
Pardo L (1997) Teoría de la Información Estadística. Hespérides, Salamanca
Pardo L (2006) Statistical inference based on divergence measures. Chapman & Hall, Boca Raton
Rényi A (1966) Calcul des Probabilités. Dunod, Paris
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423, 623–656
Sharma BD, Mittal DP (1975) New nonadditive measures of entropy for discrete probability distributions. J Math Sci 10:28–40
Trillas E, Riera T (1978) Entropies in finite fuzzy sets. Inf Sci 15:159–168
Zadeh L (1965) Fuzzy sets. Inf Contr 8:338–353
Acknowledgements
We would like to acknowledge the help and support of Prof. Gil, who initiated us into the wonderful world of research. He used to say he was our scientific father, and we are honored that he really was.
Financially, this work was partially supported by research project TIN2014-59543-P (Spain).
© 2018 Springer International Publishing AG
Montes, S., Díaz, S., Martinetti, D. (2018). Divergence Measures: From Uncertainty to Imprecision. In: Gil, E., Gil, E., Gil, J., Gil, M. (eds) The Mathematics of the Uncertain. Studies in Systems, Decision and Control, vol 142. Springer, Cham. https://doi.org/10.1007/978-3-319-73848-2_62