Abstract
The interpretation of degrees of membership as statistical likelihood is probably the oldest interpretation of fuzzy sets. In particular, it allows fuzzy data and fuzzy inferences to be easily incorporated in statistical methods, and sheds some light on the central role played by the extension principle and \(\alpha \)-cuts in fuzzy set theory.
1 Introduction
Most works on fuzzy set theory do not give any precise interpretation for the values of membership functions. This is not a problem as long as the works remain in the realm of pure mathematics. However, as soon as examples of application are included, an interpretation is needed; otherwise not only are the membership functions arbitrary, but all rules applied to them are also unjustified [3, 25, 32].
In this paper, the interpretation of the values of membership functions in terms of likelihood is reviewed. The concepts of probability and likelihood were clearly distinguished by Fisher [19]: likelihood is simpler, more intuitive, and better suited to information fusion [6, 8]. The likelihood interpretation of fuzzy sets is elucidated in Sect. 2, while Sect. 3 shows that it justifies an expression for the likelihood function induced by fuzzy data that has often appeared in the literature [13, 20, 23, 26, 35] without a clear justification. This likelihood function can also be interpreted as resulting from an errors-in-variables model or measurement error model [5], as will be illustrated by a simple example. Finally, Sect. 4 discusses the interpretation of \(\alpha \)-cuts as confidence intervals, while the last section concludes the paper and outlines future work.
2 The Likelihood Interpretation
A fuzzy set is described by its membership function \(\mu :\mathcal {X} \rightarrow [0,1]\), where \(\mathcal {X}\) is a nonempty (crisp) set [34]. A standard example is the fuzzy set representing the meaning of the word “tall” in relation to a man, where the elements of \(\mathcal {X}\) are the possible values of a man’s height in cm [36]. We can expect for instance that \(\mu (180)>\mu (160)\), because the attribute “tall” fits a 180 cm man better than a 160 cm one. However, the concept of a fuzzy set as described by a real-valued membership function \(\mu \) can only be used to model reality if we have an interpretation for the numerical values of \(\mu \).
In fact, a clear interpretation of membership functions should be the starting point of a theory of fuzzy sets that describes the real world, and all rules of the theory should be a consequence of the interpretation [3, 25, 32]. This is for example the case with the theory of probability, whose rules are a consequence of each of its interpretations (at least on finite spaces). As suggested by this example, the interpretation need not be unique, but only the rules that are implied by the considered interpretation should be used in applications.
One of the first aspects to consider when discussing the interpretation of fuzzy sets is whether they are used in an epistemic or ontic sense [13, 15]. Fuzzy sets have an ontic interpretation when they are themselves the object of inquiry, while they have an epistemic interpretation when their membership function \(\mu :\mathcal {X}\rightarrow [0,1]\) only gives information about the real object of inquiry, which is the value of \(x\in \mathcal {X}\). In this paper, we will only consider epistemic fuzzy sets, and focus on their interpretation in terms of likelihood.
The likelihood interpretation of a fuzzy set consists in interpreting its membership function \(\mu :\mathcal {X}\rightarrow [0,1]\) as the likelihood function lik on \(\mathcal {X}\) induced by the observation of an event D:
\[\mu (x)=lik(x\,|\,D)\propto P(D\,|\,x)\]
for all \(x\in \mathcal {X}\), where \(P(D\,|\,x)\) was the probability of the event D (before its realization) given the value of \(x\in \mathcal {X}\).
For example, “John is tall” is a piece of information that can be modeled by a fuzzy set with membership function \(\mu :\mathcal {X}\rightarrow [0,1]\) with \(\mu (x)\propto P(D\,|\,x)\), where the elements of \(\mathcal {X}\) are the possible values of John’s height in cm, and \(P(D\,|\,x)\) is the probability of the event D of getting the information that “John is tall” when John’s height is x cm. Hence, the exact meaning of the interpretation of fuzzy sets in terms of likelihood depends on the interpretation given to probability values, but as noted above, the choice of this interpretation does not affect the rules of probability theory.
The likelihood interpretation is probably the oldest interpretation of fuzzy sets: it has been more or less explicitly used directly after [27] and even before [2, 29] the mathematical concept of fuzzy set was introduced by Zadeh [34], and has later been studied in detail by several authors [1, 10–12, 14, 16, 17, 22, 24, 30, 31]. However, most of them interpreted membership functions \(\mu \) in terms of probability values \(\mu (x)=P(D\,|\,x)\), instead of likelihood values \(\mu (x)=lik(x\,|\,D)\). Historically, the subtle distinction between probability and likelihood confused several great minds, before the likelihood of \(x\in \mathcal {X}\) was clearly defined by Fisher as proportional to the probability of the data D given x [18, 19, 21].
The proportionality constant in the definition of \(lik(x\,|\,D)\) can depend on anything but the value of \(x\in \mathcal {X}\). The reason for defining the likelihood function lik only up to a multiplicative constant is that otherwise lik would strongly depend on irrelevant information. For example, if two persons chosen at random from a population independently tell us that John is “tall” and “very tall”, respectively, then the resulting fuzzy set should not change completely depending on whether or not we have the additional information that the first person said “tall” and the second one “very tall”.
Interpreting fuzzy sets in terms of likelihood thus implies that proportional membership functions have the same meaning. Uniqueness of representation is recovered by assuming, as is often done anyway, that all fuzzy sets are normalized. That is, their membership functions \(\mu :\mathcal {X} \rightarrow [0,1]\) satisfy \(\sup _{x\in \mathcal {X}}\mu (x)=1\), and are thus uniquely determined by \(\mu (x)\propto P(D\,|\,x)\). Surprisingly, very few authors seem to have somehow considered this important aspect of the likelihood interpretation, and not in a very explicit way [14, 25, 31].
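The invariance under rescaling can be checked numerically. The following minimal Python sketch assumes a hypothetical logistic model for \(P(D\,|\,x)\) (the function `prob_says_tall` and its parameters are illustrative choices, not taken from the cited literature), and normalizes on a finite grid of heights as a stand-in for the supremum over \(\mathcal {X}\). It shows that two proportional likelihood functions yield the same normalized membership function.

```python
import math

def prob_says_tall(x, midpoint=178.0, scale=4.0):
    """Hypothetical P(D | x): chance that an informant reports "tall" at height x cm."""
    return 1.0 / (1.0 + math.exp(-(x - midpoint) / scale))

def normalize(values):
    """Rescale nonnegative likelihood values so that their maximum equals 1."""
    top = max(values)
    if top <= 0.0:
        raise ValueError("the likelihood must be positive for at least one x")
    return [v / top for v in values]

heights = list(range(150, 211, 5))           # grid of heights in cm (assumption)
raw = [prob_says_tall(x) for x in heights]    # P(D | x) on the grid
rescaled = [0.3 * p for p in raw]             # same information, constant factor 0.3

mu = normalize(raw)
mu_rescaled = normalize(rescaled)             # agrees with mu up to rounding
```

The factor 0.3 stands for any circumstance that changes the probability of the observation by the same factor for every height, and it disappears after normalization.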
3 Fuzzy Data
A basic advantage of the likelihood interpretation of fuzzy sets is that it allows statistical inferences to be obtained directly from fuzzy data. The only condition on the statistical methods used is that the data enter them through the likelihood function only. In particular, all methods from the likelihood and Bayesian approaches to statistics can be straightforwardly generalized to the case of fuzzy data.
As discussed in Sect. 2, the membership function of a fuzzy set \(\mu (x)\propto P(D\,|\,x)\) is interpreted as the likelihood function induced by the observation of an event D. Now, if we have a probability distribution on \(x\in \mathcal {X}\), depending on an unknown parameter \(\theta \in \varTheta \), then the observation of the event D induces also a likelihood function lik on \(\varTheta \):
\[lik(\theta \,|\,D)\propto P(D\,|\,\theta )=\int _{\mathcal {X}}P(D\,|\,x)\,dP(x\,|\,\theta )\propto \int _{\mathcal {X}}\mu (x)\,dP(x\,|\,\theta ) \tag{1}\]
for all \(\theta \in \varTheta \), where \(P(D\,|\,x)\) is assumed to be a measurable function of x that does not depend on \(\theta \).
Zadeh [35] defined the probability of the fuzzy event described by a membership function \(\mu :\mathcal {X}\rightarrow [0,1]\) as the right-hand side of (1), without justifying this choice through a clear interpretation of the values of \(\mu \). The likelihood interpretation provides only a partial justification: the right-hand side of (1) is proportional to the probability of the event D that induced the fuzzy information described by \(\mu \), where the proportionality constant can depend on anything but \(\theta \) (or x).
In [35] Zadeh also introduced the concept of probabilistic independence for fuzzy events, again without a clear justification. The likelihood interpretation clarifies another concept of independence, which is extremely important in fuzzy set theory: the concept of independence among the pieces of information described by different fuzzy sets, which is usually implicitly or explicitly assumed [3, 24]. The pieces of information described by the membership functions \(\mu _{1} ,\ldots ,\mu _{n}:\mathcal {X}\rightarrow [0,1]\) with \(\mu _{i}(x)\propto P(D_{i}\,|\,x)\) can be interpreted as independent when the events \(D_{1},\ldots ,D_{n}\) that induced them were conditionally independent given x. In this case, the joint fuzzy information is described by the membership function \(\mu :\mathcal {X}\rightarrow [0,1]\) with
\[\mu (x)\propto P(D\,|\,x)=\prod _{i=1}^{n}P(D_{i}\,|\,x)\propto \prod _{i=1}^{n}\mu _{i}(x) \tag{2}\]
for all \(x\in \mathcal {X}\), where \(D=D_{1}\cap \cdots \cap D_{n}\).
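As a minimal illustration of the product rule (2), the Python sketch below combines two pieces of information about the same height on a common grid and renormalizes the result. The logistic shapes used for “tall” and “very tall”, their parameters, and the function names are hypothetical choices made only for this example.

```python
import math

def logistic(x, midpoint, scale):
    """Hypothetical membership shape; midpoint and scale are illustrative assumptions."""
    return 1.0 / (1.0 + math.exp(-(x - midpoint) / scale))

def combine_independent(memberships):
    """Product rule: multiply membership values pointwise, then renormalize to maximum 1."""
    product = [math.prod(values) for values in zip(*memberships)]
    top = max(product)
    if top <= 0.0:
        raise ValueError("the pieces of information are incompatible on this grid")
    return [p / top for p in product]

heights = list(range(150, 211, 5))                          # grid of heights in cm
mu_tall = [logistic(x, 178.0, 4.0) for x in heights]
mu_very_tall = [logistic(x, 188.0, 4.0) for x in heights]

mu_joint = combine_independent([mu_tall, mu_very_tall])
# Rescaling one input by a constant leaves the joint membership function unchanged.
mu_joint_rescaled = combine_independent([[0.5 * m for m in mu_tall], mu_very_tall])
```

The product is only justified when the underlying events are conditionally independent given x, as stated above; the code does not check this assumption.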
In particular, if \(\mathcal {X}=\mathcal {X}_{1}\times \cdots \times \mathcal {X}_{n}\), the components \(x_{i}\) of \(x=(x_{1},\ldots ,x_{n})\) are probabilistically independent (for all \(\theta \)), and each piece of fuzzy information \(\mu _{i}(x_{i})\propto P(D_{i}\,|\,x)\) is about a different component of x, then the assumption of their independence is very natural, and by combining (1) and (2) we obtain
\[lik(\theta \,|\,D)\propto \prod _{i=1}^{n}\int _{\mathcal {X}_{i}}\mu _{i}(x_{i})\,dP(x_{i}\,|\,\theta ) \tag{3}\]
for all \(\theta \in \varTheta \). This likelihood function has been considered by several authors [13, 20, 23, 26], but only justified on the basis of Zadeh’s rather arbitrary definition of the probability of a fuzzy event [35].
The likelihood function (3) induced by fuzzy data with membership functions \(\mu _{i}:\mathcal {X}_{i}\rightarrow [0,1]\) is often too complex to be handled analytically [20], but this is nowadays a typical situation in the likelihood and Bayesian approaches to statistics. In particular, \(x_{1},\ldots ,x_{n}\) play the role of unobserved variables in (3), and therefore the EM algorithm can be used to maximize the likelihood [13]. Several examples of numerical calculations of maximum likelihood estimates based on fuzzy data are given for instance in [13, 23].
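When no closed form is available, the likelihood function (3) can be evaluated by numerical integration. The sketch below is not the EM algorithm of [13]: it is a plain grid search, given here only to make the structure of (3) concrete. It assumes a normal model with known standard deviation and hypothetical triangular fuzzy numbers; all function names and numerical values are illustrative.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of the normal distribution N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def triangular(x, center, half_width):
    """Triangular membership function with peak 1 at `center`."""
    return max(0.0, 1.0 - abs(x - center) / half_width)

def expected_membership(theta, center, half_width, sd, steps=200):
    """Midpoint-rule approximation of the integral of mu_i(x) times the N(theta, sd^2) density."""
    lower = center - half_width
    dx = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * dx
        total += triangular(x, center, half_width) * normal_pdf(x, theta, sd)
    return total * dx

def log_likelihood(theta, fuzzy_data, sd):
    """Logarithm of the product in (3), up to an additive constant."""
    return sum(math.log(expected_membership(theta, c, w, sd)) for c, w in fuzzy_data)

fuzzy_data = [(9.0, 1.0), (10.0, 1.0), (11.0, 1.0)]   # hypothetical (center, half-width) pairs
sd = 1.0                                               # known standard deviation (assumption)
grid = [8.0 + 0.01 * k for k in range(401)]            # candidate values of theta
theta_hat = max(grid, key=lambda t: log_likelihood(t, fuzzy_data, sd))
```

Because the three fuzzy numbers are placed symmetrically around 10, the maximizer found on the grid should lie at or next to 10; for less regular data a finer grid or a proper optimizer would be needed.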
When the data are fuzzy numbers, in the sense that \(\mathcal {X}_{i} \subseteq \mathbb {R}\), the likelihood function (3) can also be interpreted as resulting from an errors-in-variables model or measurement error model [5]. In this case, the value \(\xi _{i}\) of a proxy \(x_{i}^{*}\) is assumed to be observed instead of the value of the variable \(x_{i}\), where \(\xi _{i}\in \mathbb {R}\) is an arbitrarily chosen constant, while the measurement error \(\varepsilon _{i}=x_{i}^{*}-x_{i}\) is random with density \(f_{i}\propto \mu _{i}(\xi _{i}-\,\cdot \,)\) and independent of everything else. In this model, each fuzzy number \(\mu _{i}(x_{i})\propto f_{i}(\xi _{i}-x_{i})\propto lik(x_{i}\,|\,x_{i}^{*}=\xi _{i})\) describes the information about the unknown value of \(x_{i}\) obtained from the observed value of its proxy \(x_{i}^{*}\), and the likelihood function \(lik(\,\cdot \,|\,x_{1}^{*}=\xi _{1},\,\ldots ,\,x_{n}^{*}=\xi _{n})\) on \(\varTheta \) induced by these observations is the one in (3). The description of fuzzy data in terms of measurement errors is particularly useful when the various components combine well mathematically, as in the following simple example.
Example 1
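Assume that \(x_{1},\ldots ,x_{n}\) is a sample from a normal distribution with known variance \(\sigma ^{2}\) and unknown expectation \(\theta \in \mathbb {R}\), but we have only fuzzy data with membership functions \(\mu _{i}(x_{i})=\exp \left( -\frac{(x_{i}-\xi _{i})^{2}}{2\,\sigma _{i}^{2}}\right) \), where \(\xi _{i},\sigma _{i}\) are known constants. Then the proxy variables \(x_{1}^{*},\ldots ,x_{n}^{*}\) are independent, and each \(x_{i}^{*}\) is normally distributed with expectation \(\theta \) and variance \(\sigma ^{2}+\sigma _{i}^{2} \). Hence, the likelihood function induced by the fuzzy data is given by
\[lik(\theta \,|\,D)\propto \prod _{i=1}^{n}\exp \left( -\frac{(\xi _{i}-\theta )^{2}}{2\,(\sigma ^{2}+\sigma _{i}^{2})}\right) \propto \exp \left( -\frac{(\theta -\hat{\theta })^{2}}{2\,\tau ^{2}}\right) \tag{4}\]
for all \(\theta \in \mathbb {R}\), where the maximum likelihood estimate \(\hat{\theta }\) is the weighted average of the centers \(\xi _{i}\) of the fuzzy numbers, with weights proportional to \(1/(\sigma ^{2}+\sigma _{i}^{2})\) and thus depending on their precision \(1/\sigma _{i}^{2}\), while \(1/\tau ^{2}=\sum _{i=1}^{n}1/(\sigma ^{2}+\sigma _{i}^{2})\) is the precision of \(\hat{\theta }\) (which is normally distributed with expectation \(\theta \) and variance \(\tau ^{2}\)).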
Besides the maximum likelihood estimate \(\hat{\theta }\), for each \(\alpha \in (0,1)\) we obtain a likelihood-based confidence interval for \(\theta \):
\[\left\{ \theta \in \mathbb {R}:lik(\theta \,|\,D)>\alpha \,lik(\hat{\theta }\,|\,D)\right\} =\left( \hat{\theta }-\tau \sqrt{-2\,\ln \alpha },\ \hat{\theta }+\tau \sqrt{-2\,\ln \alpha }\right) \tag{5}\]
with exact level \(F_{\chi _{1}^{2}}(-2\,\ln \alpha )\), where \(F_{\chi _{1}^{2}}\) is the cumulative distribution function of the chi-squared distribution with 1 degree of freedom. Alternatively, we can combine the likelihood function (4) induced by the fuzzy data with a Bayesian prior, and base our conclusions on the resulting posterior. In particular, if the prior is a normal distribution with expectation \(\theta _{0}\) and variance \(\tau _{0}^{2}\), then the posterior is a normal distribution with expectation \(\theta _{1}\) and variance \(\tau _{1}^{2}\), where \(\theta _{1}\) is the weighted average of \(\theta _{0}\) and \(\hat{\theta }\), with weights proportional to their precision \(1/\tau _{0}^{2}\) and \(1/\tau ^{2}\), respectively, while these add up to the posterior precision \(1/\tau _{1}^{2}=1/\tau _{0}^{2}+1/\tau ^{2}\).
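The quantities of Example 1 are easy to compute directly. The Python sketch below assumes Gaussian-shaped membership functions with centers \(\xi _{i}\) and spreads \(\sigma _{i}\), so that each center is weighted by \(1/(\sigma ^{2}+\sigma _{i}^{2})\), the reciprocal of the variance of the corresponding proxy. The function names and the numerical data are hypothetical, and the chi-squared distribution function with 1 degree of freedom is evaluated through the identity \(F_{\chi _{1}^{2}}(t)=\mathrm {erf}(\sqrt{t/2})\).

```python
import math

def fuzzy_normal_inference(centers, spreads, sigma, alpha):
    """Closed-form summary for Gaussian-shaped fuzzy data under a normal model.

    Assumes mu_i(x) = exp(-(x - centers[i])**2 / (2 * spreads[i]**2)) and a
    sample from N(theta, sigma**2) with known sigma.
    """
    weights = [1.0 / (sigma ** 2 + s ** 2) for s in spreads]
    precision = sum(weights)                                  # this is 1 / tau^2
    theta_hat = sum(w * c for w, c in zip(weights, centers)) / precision
    tau = math.sqrt(1.0 / precision)
    half_width = tau * math.sqrt(-2.0 * math.log(alpha))
    # Chi-squared cdf with 1 degree of freedom at -2 ln(alpha): erf(sqrt(-ln(alpha))).
    level = math.erf(math.sqrt(-math.log(alpha)))
    return theta_hat, tau, (theta_hat - half_width, theta_hat + half_width), level

def normal_posterior(theta_hat, tau, theta0, tau0):
    """Combine the likelihood summary with a N(theta0, tau0^2) prior."""
    prior_precision = 1.0 / tau0 ** 2
    data_precision = 1.0 / tau ** 2
    posterior_precision = prior_precision + data_precision
    theta1 = (prior_precision * theta0 + data_precision * theta_hat) / posterior_precision
    return theta1, math.sqrt(1.0 / posterior_precision)

centers = [9.0, 10.0, 12.0]             # hypothetical centers xi_i
spreads = [1.0, 1.0, 2.0]               # hypothetical spreads sigma_i
alpha = math.exp(-0.5 * 1.96 ** 2)      # chosen so that the level is close to 95%
theta_hat, tau, interval, level = fuzzy_normal_inference(centers, spreads, 1.0, alpha)
theta1, tau1 = normal_posterior(theta_hat, tau, theta0=8.0, tau0=2.0)
```

With this choice of \(\alpha \), the interval is the familiar \(\hat{\theta }\pm 1.96\,\tau \), which shows how a cut level translates into a conventional confidence level in this model.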
4 Fuzzy Inference
Besides allowing the direct use of fuzzy data in statistical methods, the likelihood interpretation of fuzzy sets also leads naturally to fuzzy statistical inference. In fact, the likelihood function on \(\varTheta \) induced by the (fuzzy or crisp) data can be interpreted as the membership function \(\mu :\varTheta \rightarrow [0,1]\) of a (normalized) fuzzy set describing the information obtained from the data about the unknown value of the parameter \(\theta \in \varTheta \).
In particular, the likelihood-based confidence intervals (or regions) for \(\theta \), defined as in the left-hand side of (5) for all \(\alpha \in (0,1)\), correspond to the \(\alpha \)-cuts of the fuzzy set with membership function \(\mu \). Both likelihood-based confidence intervals and \(\alpha \)-cuts are usually defined using the non-strict inequality, but the choice of the strict inequality in the definition provides a better agreement with the concept of profile likelihood function [9], which is of central importance in the likelihood approach to statistics, and corresponds to the extension principle [36], which is equally central in fuzzy set theory.
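The agreement between strict \(\alpha \)-cuts and the extension principle can be checked on a finite grid. In the Python sketch below, the membership function on \(\varTheta \), the map \(g(\theta )=\theta ^{2}\), and all names are hypothetical choices for illustration. The membership value induced on each value of g is the maximum of \(\mu \) over its preimage, that is, a profile likelihood.

```python
import math

def profile(mu, g):
    """Extension principle: each value gamma gets the max of mu over all theta with g(theta) == gamma."""
    result = {}
    for theta, value in mu.items():
        gamma = g(theta)
        result[gamma] = max(result.get(gamma, 0.0), value)
    return result

def strict_cut(mu, alpha):
    """Strict alpha-cut: all points whose membership value exceeds alpha."""
    return {point for point, value in mu.items() if value > alpha}

def square(theta):
    return theta * theta

thetas = [0.5 * k for k in range(-6, 7)]                     # grid on [-3, 3]
mu = {t: math.exp(-0.5 * (t - 1.0) ** 2) for t in thetas}    # normalized, peak at theta = 1

mu_g = profile(mu, square)
alpha = 0.3
image_of_cut = {square(t) for t in strict_cut(mu, alpha)}    # g applied to the alpha-cut
cut_of_profile = strict_cut(mu_g, alpha)                     # alpha-cut of the induced fuzzy set
```

The two sets coincide for every \(\alpha \). On a finite grid the non-strict version agrees as well, since every maximum is attained; the two definitions can differ only when a supremum is not attained.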
A correspondence between \(\alpha \)-cuts and (general) confidence intervals has also been suggested as an alternative interpretation of some fuzzy sets [4, 28]. However, this interpretation is afflicted by the fact that confidence intervals are rather arbitrary constructs, and in particular do not usually satisfy the extension principle, when they are not likelihood-based confidence intervals. The interpretation of fuzzy sets in terms of likelihood-based confidence intervals (i.e. the likelihood interpretation) has the advantage of uniqueness, invariance, and general applicability, although a simple expression for the confidence level based on the chi-squared distribution, as in Example 1, is valid (exactly or asymptotically) only under some regularity conditions [33].
Since each value of \(\theta \in \varTheta \) corresponds to a probability measure \(P(\,\cdot \,|\,\theta )\), a fuzzy set with membership function \(\mu :\varTheta \rightarrow [0,1]\) can also be interpreted as a fuzzy probability measure [6, 7]. This likelihood-based model of fuzzy probability bears important similarities to the Bayesian model of probability, and can be used as a basis for statistical inference and decision making [6–8].
5 Conclusion
In this paper, the likelihood interpretation of fuzzy sets has been reviewed and some of its consequences analyzed. Not surprisingly, with this interpretation fuzzy data and fuzzy inferences can be easily incorporated in statistical methods. In particular, the likelihood interpretation of fuzzy data justifies the use of expression (3) for the induced likelihood function, and establishes a fruitful connection with errors-in-variables models or measurement error models, as illustrated by Example 1. Furthermore, the link between this interpretation and the likelihood approach to statistics sheds some light on the central role played by the extension principle and \(\alpha \)-cuts in fuzzy set theory.
The theory of fuzzy sets is also a theory of information fusion. However, only the product rule \(\mu (x)\propto \prod _{i=1}^{n}\mu _{i}(x)\) for the conjunction of independent pieces of information is directly justified by the likelihood interpretation (2). The rules for other logical connectives, with or without the independence assumption, can be obtained through the concept of profile likelihood (i.e. the extension principle). For example, the conjunction without independence assumption is then given by the minimum rule \(\mu (x)\propto \bigwedge _{i=1}^{n}\mu _{i}(x)\), while negation always results in the vacuous membership function \(\mu \equiv 1\). Such rules, which are a consequence of the likelihood interpretation of fuzzy sets, will be the topic of future work.
References
1. Bilgiç T, Türkşen IB (2000) Measurement of membership functions: theoretical and empirical work. In: Prade H, Dubois D (eds) Fundamentals of fuzzy sets. Springer, pp 195–230
2. Black M (1937) Vagueness. Philos Sci 4:427–455
3. Bradley J (2009) Fuzzy logic as a theory of vagueness: 15 conceptual questions. In: Seising R (ed) Views on fuzzy sets and systems from different perspectives. Springer, pp 207–228
4. Buckley JJ (2006) Fuzzy probability and statistics. Springer, New York
5. Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM (2006) Measurement error in nonlinear models, 2nd edn. Chapman & Hall/CRC
6. Cattaneo M (2008) Fuzzy probabilities based on the likelihood function. In: Lubiano MA, Prade H, Gil MÁ, Grzegorzewski P, Hryniewicz O, Dubois D (eds) Soft methods for handling variability and imprecision. Springer, New York, pp 43–50
7. Cattaneo M (2009) A generalization of credal networks. In: Augustin T, Coolen FPA, Moral S, Troffaes MCM (eds) ISIPTA ’09, SIPTA, pp 79–88
8. Cattaneo M (2013) Likelihood decision functions. Electron J Stat 7:2924–2946
9. Cattaneo M, Wiencierz A (2012) Likelihood-based imprecise regression. Int J Approx Reason 53:1137–1154
10. Coletti G, Scozzafava R (2004) Conditional probability, fuzzy sets, and possibility: a unifying view. Fuzzy Sets Syst 144:227–249
11. Coletti G, Vantaggi B (2010) From comparative degrees of belief to conditional measures. In: Squillante M, Yager RR, Kacprzyk J, Greco S, Marques Pereira RA (eds) Preferences and decisions. Springer, pp 69–84
12. Coletti G, Vantaggi B (2013) Inference with probabilistic and fuzzy information. In: Seising R, Trillas E, Moraga C, Termini S (eds) On fuzziness, vol 1. Springer, pp 115–119
13. Denœux T (2011) Maximum likelihood estimation from fuzzy data using the EM algorithm. Fuzzy Sets Syst 183:72–91
14. Dubois D (2006) Possibility theory and statistical reasoning. Comput Stat Data Anal 51:47–69
15. Dubois D, Prade H (2012) Gradualness, uncertainty and bipolarity: making sense of fuzzy sets. Fuzzy Sets Syst 192:3–24
16. Dubois D, Moral S, Prade H (1997) A semantics for possibility theory based on likelihoods. J Math Anal Appl 205:359–380
17. Dubois D, Nguyen HT, Prade H (2000) Possibility theory, probability and fuzzy sets. In: Prade H, Dubois D (eds) Fundamentals of fuzzy sets. Springer, pp 343–438
18. Edwards AWF (1974) The history of likelihood. Int Stat Rev 42:9–15
19. Fisher RA (1921) On the “probable error” of a coefficient of correlation deduced from a small sample. Metron 1:3–32
20. Gil MÁ, Casals MR (1988) An operative extension of the likelihood ratio test from fuzzy data. Stat Pap 29:191–203
21. Hald A (1999) On the history of maximum likelihood in relation to inverse probability and least squares. Stat Sci 14:214–222
22. Hisdal E (1988) Are grades of membership probabilities? Fuzzy Sets Syst 25:325–348
23. Jung HY, Lee WJ, Yoon JH, Choi SH (2014) Likelihood inference based on fuzzy data in regression model. In: SCIS & ISIS 2014, IEEE, pp 1175–1179
24. Kovalerchuk B (2014) Probabilistic solution of Zadeh’s test problems. In: Laurent A, Strauss O, Bouchon-Meunier B, Yager RR (eds) Information processing and management of uncertainty in knowledge-based systems, vol 2. Springer, pp 536–545
25. Lindley DV (2004) Comment to [31]. J Am Stat Assoc 99:877–879
26. Liu X, Li S (2013) Cumulative distribution function estimation with fuzzy data: Some estimators and further problems. In: Berthold MR, Moewes C, Gil MÁ, Grzegorzewski P, Hryniewicz O, Kruse R (eds) Synergies of soft computing and statistics for intelligent data analysis. Springer, pp 83–91
27. Loginov VI (1966) Probability treatment of Zadeh membership functions and their use in pattern recognition. Eng Cybern 4:68–69
28. Mauris G (2008) Inferring a possibility distribution from very few measurements. In: Lubiano MA, Prade H, Gil MÁ, Grzegorzewski P, Hryniewicz O, Dubois D (eds) Soft methods for handling variability and imprecision. Springer, pp 92–99
29. Menger K (1951) Ensembles flous et fonctions aléatoires. C R Acad Sci 232:2001–2003
30. Scozzafava R (2013) The membership of a fuzzy set as coherent conditional probability. In: Seising R, Trillas E, Moraga C, Termini S (eds) On fuzziness, vol 2. Springer, pp 631–635
31. Singpurwalla ND, Booker JM (2004) Membership functions and probability measures of fuzzy sets. J Am Stat Assoc 99:867–877
32. Walley P (1991) Statistical reasoning with imprecise probabilities. Chapman and Hall
33. Wilks SS (1938) The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann Math Stat 9:60–62
34. Zadeh LA (1965) Fuzzy sets. Inf Control 8:338–353
35. Zadeh LA (1968) Probability measures of fuzzy events. J Math Anal Appl 23:421–427
36. Zadeh LA (1975) The concept of a linguistic variable and its application to approximate reasoning. Inf Sci 8:199–249, 8:301–357, 9:43–80
© 2017 Springer International Publishing Switzerland
Cite this paper
Cattaneo, M.E.G.V. (2017). The Likelihood Interpretation of Fuzzy Data. In: Ferraro, M., et al. Soft Methods for Data Science. SMPS 2016. Advances in Intelligent Systems and Computing, vol 456. Springer, Cham. https://doi.org/10.1007/978-3-319-42972-4_14
DOI: https://doi.org/10.1007/978-3-319-42972-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42971-7
Online ISBN: 978-3-319-42972-4
eBook Packages: Engineering, Engineering (R0)