Abstract
The lecture presents a new, non-statistical approach to the analysis and construction of similarity, dissimilarity and correlation measures. The measures are considered as functions defined on an underlying set and satisfying given sets of properties. Different functional structures, the relationships between them and methods of their construction are discussed. Particular attention is paid to functions defined on sets with an involution operation, for which the class of (strong) correlation functions is introduced. General methods for constructing new correlation functions from similarity and dissimilarity functions are considered. It is shown that the classical correlation and association coefficients (Pearson’s, Spearman’s, Kendall’s, Yule’s Q, Hamann’s) can be obtained as particular cases.
Keywords
- Similarity measure
- Pearson’s product-moment correlation
- Spearman’s rank correlation
- Kendall’s rank correlation
- Yule’s Q
1 Introduction
Data Science is the buzzword of the last decade. One of the goals of Data Science is to extract knowledge, insights and usable information from various types of data. Data Science courses typically include topics on statistical data analysis, machine learning methods and data mining. Correlation, association, similarity, relationship and interestingness coefficients or measures play an important role in data analysis, classification tasks and data mining. Dozens of such measures have been created for various types of data, and articles on these indicators are cited in hundreds and thousands of papers [3, 13, 14, 20, 23–25, 27]. Such measures are defined for dichotomous or real valued variables measured for sampling units, for rankings, binary or real valued vectors of attributes, time series, fuzzy sets of different types, probability distributions, images, rating profiles, etc. These measures are used in information retrieval, data classification, machine learning, analysis of relationships and decision making in ecology, computational linguistics, image and signal processing, financial data analysis, bioinformatics and social sciences. Often, measures introduced for one type of data give misleading results when they are used for the analysis of relationships between data of another type [8]. There is a need for a general theory of similarity, correlation and association measures that can analyze the general properties of these measures independently of the domain, or for a large class of data types. Such a theory would make it possible to transfer methods of constructing association measures from one domain to another and to propose such measures for new data types.
The first steps in the construction of a general theory of similarity, dissimilarity and correlation measures were taken in [7, 9, 10], where fundamental results on the definition and construction of association measures (correlation coefficients) on sets with involution operations were proposed. It was shown that many known correlation coefficients can be considered as functions defined on an underlying set with an involution operation and satisfying a simple set of properties. General methods for constructing these functions from suitable similarity or dissimilarity functions were proposed. These results can be considered as an alternative solution of Kendall’s problem of constructing a general correlation coefficient [22]. Kendall proposed a formula that gives as particular cases Pearson’s product-moment correlation and Spearman’s and Kendall’s rank correlation coefficients, but it was not clear how to obtain other known association and correlation coefficients from this formula or how to build correlation coefficients for other domains. The approach proposed in [7, 9, 10] makes it possible to construct the classical correlation coefficients obtainable from Kendall’s formula, as well as Yule’s Q and the Hamann coefficient together with some other association coefficients for binary data, and gives a tool for constructing new correlation coefficients on new domains.
Similarity, dissimilarity and correlation measures or coefficients are considered in this lecture as functions defined on an underlying set \( \Omega \) and satisfying some reasonable sets of properties. Generally, these functions can be considered as association functions, measuring (possibly non-statistical) associations or relationships between objects. The methods of constructing such functions, their transformations and the relationships between them are studied in [7, 9, 10]. This paper presents a short and updated description of part of the author’s Lecture at the RAAI Summer School 2019. Due to the limit on the size of the paper it was not possible to include all topics discussed in [9] and in this lecture. The presented paper can be considered as complementary to [9]. It includes new methods of constructing correlation functions presented recently at INES 2019 [10] and contains more examples of (dis)similarity and correlation functions illustrating these methods. The list of references includes only a few works; hundreds of papers have been published on related topics during more than one hundred years, and it was not possible to include most of them. Some useful references can be found in the papers cited in the list of references.
2 Examples of Similarity Measures and Correlation Coefficients for Different Domains
2.1 Binary n-Tuples
Consider an n-tuple \( x = (x_1, \ldots, x_n) \), \( x_i \in \{0,1\} \), \( i = 1, \ldots, n \). It can represent an object x described by n binary features, attributes or properties such that \( x_i = 1 \) if the object x possesses the i-th attribute and \( x_i = 0 \) in the opposite case. Denote by X the set of attributes possessed by x. In statistics, a binary n-tuple can denote n measurements of a dichotomous variable or property x for n sampling units: \( x_i = 1 \) denotes that the property x is fulfilled for the unit i, and \( x_i = 0 \) otherwise. We use here the first interpretation of binary n-tuples. For two binary n-tuples \( x = (x_1, \ldots, x_n) \), \( y = (y_1, \ldots, y_n) \) denote:
- a: the number of attributes possessed by both objects x and y, when \( x_i = y_i = 1 \);
- b: the number of attributes possessed by x and not by y, when \( x_i = 1, y_i = 0 \);
- c: the number of attributes possessed by y and not by x, when \( x_i = 0, y_i = 1 \);
- d: the number of attributes possessed by neither x nor y, when \( x_i = y_i = 0 \).
The numbers a and d are usually referred to as the numbers of positive and negative matches, respectively. Below are examples of similarity and association measures for binary n-tuples [13]:
Jaccard similarity measure: \( S_J(x,y) = \frac{a}{a+b+c} \).
Simple Matching similarity measure: \( S_{SM}(x,y) = \frac{a+d}{a+b+c+d} \).
Hamann coefficient: \( A_H(x,y) = \frac{(a+d)-(b+c)}{a+b+c+d} \).
Yule’s Q association coefficient: \( A_{Y\text{-}Q}(x,y) = \frac{ad-bc}{ad+bc} \).
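Using the counts a, b, c, d introduced above, the four measures can be sketched in Python (an illustrative sketch; the function names are not from the paper, and zero denominators are assumed not to occur):

```python
def match_counts(x, y):
    """Count positive matches (a), one-sided mismatches (b, c) and negative matches (d)."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return a, b, c, d

def jaccard(x, y):
    a, b, c, _ = match_counts(x, y)
    return a / (a + b + c)                    # ignores negative matches d

def simple_matching(x, y):
    a, b, c, d = match_counts(x, y)
    return (a + d) / (a + b + c + d)

def hamann(x, y):
    a, b, c, d = match_counts(x, y)
    return (a + d - b - c) / (a + b + c + d)  # values in [-1, 1]

def yule_q(x, y):
    a, b, c, d = match_counts(x, y)
    return (a * d - b * c) / (a * d + b * c)  # values in [-1, 1]
```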
2.2 Real Valued n-Tuples
Consider n-tuples \( x = \left( {x_{1} , \ldots ,x_{n} } \right), \) \( y = \left( {y_{1} , \ldots ,y_{n} } \right) \) with real valued components.
Cosine similarity measure: \( S_{cos}(x,y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}} \),
where \( x_{i} , y_{i} \ge 0 \), \( i = 1, \ldots ,n \).
Pearson’s product-moment correlation coefficient: \( r(x,y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \),
where \( \bar{x} = \frac{1}{n}\mathop \sum \nolimits_{i = 1}^{n} x_{i} \), \( \bar{y} = \frac{1}{n}\mathop \sum \nolimits_{i = 1}^{n} y_{i}. \)
For both measures, it is supposed that the denominators are not equal to zero.
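The two measures can be sketched as follows (an illustrative Python sketch; the nonzero-denominator assumptions of the text are taken for granted):

```python
import math

def cosine(x, y):
    """Cosine similarity; assumes neither n-tuple is the zero vector."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)

def pearson(x, y):
    """Pearson's product-moment correlation; assumes non-constant n-tuples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```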
2.3 Rankings
Consider two rankings of n objects \( x = \left( {x_{1} , \ldots ,x_{n} } \right), \) \( y = \left( {y_{1} , \ldots ,y_{n} } \right) \) containing n different integer ranks, \( 1 \le x_{i} ,y_{i} \le n \), i.e. “without ties”.
Spearman’s rank correlation coefficient [12, 18, 22]: \( \rho(x,y) = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)} \),
where \( d_{i} = x_{i} - y_{i} , i = 1, \ldots ,n \).
Kendall’s rank correlation coefficient [12, 18, 22] for real valued n-tuples \( x = (x_1, \ldots, x_n) \), \( y = (y_1, \ldots, y_n) \) without ties, i.e. \( x_i \ne x_j \), \( y_i \ne y_j \) for all \( i \ne j \): \( \tau(x,y) = \frac{NC - ND}{n(n-1)/2} \),
where NC is the number of concordant pairs \( (i,j) \), \( i < j \), i.e. pairs with \( (x_i - x_j)(y_i - y_j) > 0 \),
and ND is the number of discordant pairs, i.e. pairs with \( (x_i - x_j)(y_i - y_j) < 0 \).
Note that only the ordering of the component values of n-tuples is used.
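The two rank correlation coefficients above can be sketched for rankings without ties (an illustrative Python sketch; naive O(n²) pair counting is used for clarity):

```python
def spearman_rho(x, y):
    """Spearman's rho = 1 - 6*sum(d_i^2)/(n(n^2-1)), d_i = x_i - y_i."""
    n = len(x)
    return 1 - 6 * sum((a - b) ** 2 for a, b in zip(x, y)) / (n * (n * n - 1))

def kendall_tau(x, y):
    """Kendall's tau = (NC - ND) / (n(n-1)/2) for n-tuples without ties."""
    n = len(x)
    nc = nd = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                nc += 1   # concordant pair: same ordering in x and y
            elif s < 0:
                nd += 1   # discordant pair: opposite ordering
    return (nc - nd) / (n * (n - 1) / 2)
```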
2.4 Finite Probabilistic Distributions
Suppose \( x = (x_1, \ldots, x_n) \) and \( y = (y_1, \ldots, y_n) \) are finite probability distributions, i.e. \( x_i, y_i \ge 0 \) for all \( i = 1, \ldots, n \), and \( \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i = 1 \).
The Bhattacharyya coefficient is defined as follows [1]: \( BC(x,y) = \sum_{i=1}^{n} \sqrt{x_i y_i} \).
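A minimal sketch of the coefficient (the function name is illustrative):

```python
import math

def bhattacharyya(x, y):
    """Bhattacharyya coefficient BC(x, y) = sum(sqrt(x_i * y_i)) for two
    finite probability distributions; note BC(x, x) = sum(x_i) = 1."""
    return sum(math.sqrt(a * b) for a, b in zip(x, y))
```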
2.5 Kendall’s General Correlation Coefficient
Kendall [22] considered the problem of constructing a general correlation coefficient. For two n-tuples \( x = (x_1, \ldots, x_n) \) and \( y = (y_1, \ldots, y_n) \) he defined it as: \( \Gamma(x,y) = \frac{\sum_{i,j=1}^{n} a_{ij} b_{ij}}{\sqrt{\sum_{i,j=1}^{n} a_{ij}^2 \sum_{i,j=1}^{n} b_{ij}^2}} \),
where the values \( a_{ij} \) are calculated from the values \( x_i \) and \( x_j \), and similarly \( b_{ij} \) are calculated from \( y_i \) and \( y_j \), such that the following conditions are fulfilled: \( a_{ij} = -a_{ji} \), \( b_{ij} = -b_{ji} \), \( a_{ii} = b_{ii} = 0 \) for all \( i,j = 1, \ldots, n \). As particular cases of this formula he obtained Spearman’s and Kendall’s rank correlation coefficients and Pearson’s product-moment correlation coefficient. It is not clear, however, how to obtain other correlation coefficients from this formula.
3 Similarity and Dissimilarity Functions
The examples of similarity measures and correlation coefficients considered in the previous section show that there is a variety of such indicators for different types of data. Overall, tens of such measures and coefficients have been introduced [13, 14, 27]. To analyze the general properties of these measures and the possible relationships between them, and to study the methods of their construction, it was proposed in [9] to consider these measures as functions defined on a universal domain \( \Omega \) and satisfying some simple sets of properties. Three main types of such functions have been considered: similarity, dissimilarity and correlation functions, which can be used as models of similarity, dissimilarity and correlation measures and coefficients, respectively. Below we introduce these functions and consider some of their properties. More results can be found in [9].
Further, for brevity, similarity and dissimilarity functions will be referred to as (dis)similarity functions or as resemblance functions.
3.1 Definition of Similarity, Dissimilarity and Correlation Functions
Let \( \Omega \) be a nonempty set that will be referred to as a universal domain or an underlying set in definition of similarity and correlation functions. As \( \Omega \) we can consider domain specific for a considered data type: the set of all binary n-tuples, the set of all real valued vectors of the length n, the set of all fuzzy sets defined on some domain X, the set of some images or objects considered in some problem, etc.
A function \( S:\Omega \times\Omega \to \left[ {0,1} \right] \) is called a similarity function on \( \Omega \) if for all x, y in \( \Omega \) it is symmetric: \( S(x,y) = S(y,x) \),
and reflexive: \( S(x,x) = 1 \).
A function \( D:\Omega \times\Omega \to \left[ {0,1} \right] \) is a dissimilarity function on \( \Omega \) if for all x, y in \( \Omega \) it is symmetric: \( D(x,y) = D(y,x) \),
and irreflexive: \( D(x,x) = 0 \).
These functions are called complementary if for all x, y in \( \Omega \) it is fulfilled: \( S(x,y) + D(x,y) = 1 \). (1)
For complementary (dis)similarity functions we have: \( D(x,y) = 1 - S(x,y) \), \( S(x,y) = 1 - D(x,y) \).
A function \( A:\Omega \times\Omega \to \left[ { - 1,1} \right] \) is a correlation function (association measure) on \( \Omega \) if for all x, y in \( \Omega \) it is symmetric: \( A(x,y) = A(y,x) \),
reflexive: \( A(x,x) = 1 \),
and negative: \( A\left( {x,y} \right) < 0 \) for some x, y in \( \Omega \).
Such correlation functions will be referred to as weak correlation functions if they do not satisfy the inverse relationship property considered in Sect. 4. In Sect. 4 we define a strong (invertible) correlation function on a set \( \Omega \) with an involution operation.
It is easy to see that the indicators considered in the previous section belong to the following classes of functions:
- Similarity functions: Jaccard, Simple Matching and Cosine (if all \( x_i \) and \( y_i \) are non-negative) similarity measures and the Bhattacharyya coefficient;
- Correlation functions: Hamann and Yule’s Q coefficients, Pearson’s product-moment correlation coefficient, Spearman’s and Kendall’s rank correlation coefficients, and Cosine (if \( x_i \) and \( y_i \) can take both positive and negative values).
3.2 Examples of Complementary (Dis)Similarity Functions
Jaccard (see Sect. 2.1): \( S_J(x,y) = \frac{a}{a+b+c} \), \( D_J(x,y) = \frac{b+c}{a+b+c} \).
Simple Matching (Sect. 2.1): \( S_{SM}(x,y) = \frac{a+d}{a+b+c+d} \), \( D_{SM}(x,y) = \frac{b+c}{a+b+c+d} \).
Cosine (Sect. 2.2): \( S_{cos}(x,y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}} \), \( D_{cos}(x,y) = 1 - S_{cos}(x,y) \).
Bhattacharyya (Sect. 2.4): \( S_B(x,y) = \sum_{i=1}^{n} \sqrt{x_i y_i} \), \( D_B(x,y) = 1 - \sum_{i=1}^{n} \sqrt{x_i y_i} = \frac{1}{2}\sum_{i=1}^{n} \left(\sqrt{x_i} - \sqrt{y_i}\right)^2 \).
The last dissimilarity function is called Hellinger discrimination, Matusita measure or Squared-Chord distance.
3.3 Fuzzy Relations and Kleene Algebra of Resemblance Functions
Here, similarity and dissimilarity functions will be also referred to as resemblance functions. A resemblance function is a symmetric function \( R:\Omega \times\Omega \to \left[ {0,1} \right] \), which is reflexive or irreflexive. We will say that two resemblance functions have the same type if both are reflexive or both are irreflexive.
A resemblance function R can be considered as a fuzzy relation [2, 17, 28, 29], given by a membership function \( \mu_{R} :\Omega \times\Omega \to \left[ {0,1} \right] \), where \( \mu_{R} \left( {x,y} \right) \) is the strength of the relation R between x and y. The properties of fuzzy relations can be extended to resemblance functions.
Denote \( {\mathcal{P}}\left( S \right),{ \mathcal{P}}\left( D \right) \) and \( {\mathcal{P}}\left( R \right) \) the sets of all similarity, dissimilarity and resemblance functions, respectively, defined on the set \( \Omega \). We have \( {\mathcal{P}}\left( S \right),{ \mathcal{P}}\left( D \right) \subset {\mathcal{P}}\left( R \right) \) and \( {\mathcal{P}}\left( S \right) \cup { \mathcal{P}}\left( D \right) = {\mathcal{P}}\left( R \right) \). Define the operations intersection \( \cap \) and union \( \cup \) of resemblance functions \( R_{1} \) and \( R_{2} \) on the set \( \Omega \) for all x, y in \( \Omega \) as follows: \( (R_1 \cap R_2)(x,y) = \min(R_1(x,y), R_2(x,y)) \), \( (R_1 \cup R_2)(x,y) = \max(R_1(x,y), R_2(x,y)) \).
The set \( {\mathcal{P}}\left( R \right) \) will be a distributive lattice [11], partially ordered by the relation: \( R_1 \le R_2 \) if \( R_1(x,y) \le R_2(x,y) \) for all x, y in \( \Omega \).
The sets \( {\mathcal{P}}\left( S \right) \) and \( {\mathcal{P}}\left( D \right) \) are distributive sublattices of \( {\mathcal{P}}\left( R \right) \). For all similarity functions \( S_{1} ,S_{2} \in {\mathcal{P}}\left( S \right) \) and dissimilarity functions \( D_{1} ,D_{2} \in {\mathcal{P}}\left( D \right) \) it is fulfilled: \( S_1 \cap S_2,\ S_1 \cup S_2 \in {\mathcal{P}}\left( S \right) \), \( D_1 \cap D_2,\ D_1 \cup D_2 \in {\mathcal{P}}\left( D \right) \).
Consider the similarity and dissimilarity functions defined for all x, y in \( \Omega \) by: \( D_0(x,y) = 0 \); \( S_1(x,y) = 1 \); \( S_0(x,y) = 1 \) if \( x = y \) and \( S_0(x,y) = 0 \) otherwise; \( D_1(x,y) = 0 \) if \( x = y \) and \( D_1(x,y) = 1 \) otherwise.
\( D_{0} \) is the least element of \( {\mathcal{P}}\left( D \right) \) and \( {\mathcal{P}}\left( R \right) \), \( S_{1} \) is the greatest element of \( {\mathcal{P}}\left( S \right) \) and \( {\mathcal{P}}\left( R \right) \), \( S_{0} \) is the least element of \( {\mathcal{P}}\left( S \right) \) and \( D_{1} \) is the greatest element of \( {\mathcal{P}}\left( D \right) \).
The complement \( N\left( R \right) \) of a resemblance function R is defined for all x, y in \( \Omega \) by: \( {N\left( R \right)} \left( {x,y} \right) = 1 - R\left( {x,y} \right) \). This operation is involutive, i.e. for all \( R \) in \( {\mathcal{P}}\left( R \right) \) we have: \( N\left( {N\left( R \right)} \right) = R \). The complement of the similarity and dissimilarity functions will be equal to their complementary dissimilarity and similarity functions (1), respectively: \( N(S) = D \), \( N(D) = S \).
The lattice \( {\mathcal{P}}\left( R \right) \) with the complement N will be a normal De Morgan (Kleene) algebra [5], where for any resemblance functions \( R_{1} \) and \( R_{2} \) the De Morgan laws: \( N(R_1 \cup R_2) = N(R_1) \cap N(R_2) \), \( N(R_1 \cap R_2) = N(R_1) \cup N(R_2) \),
and normality: \( R_1 \cap N(R_1) \le R_2 \cup N(R_2) \)
are fulfilled.
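These pointwise lattice operations can be sketched directly (an illustrative Python sketch; the sample functions r1 and r2 are toy reflexive similarity functions, not from the paper):

```python
def meet(r1, r2):
    """Intersection: (R1 ∩ R2)(x, y) = min(R1(x, y), R2(x, y))."""
    return lambda x, y: min(r1(x, y), r2(x, y))

def join(r1, r2):
    """Union: (R1 ∪ R2)(x, y) = max(R1(x, y), R2(x, y))."""
    return lambda x, y: max(r1(x, y), r2(x, y))

def complement(r):
    """N(R)(x, y) = 1 - R(x, y); involutive: N(N(R)) = R."""
    return lambda x, y: 1 - r(x, y)

# Two toy reflexive similarity functions on an arbitrary domain.
r1 = lambda x, y: 1.0 if x == y else 0.4
r2 = lambda x, y: 1.0 if x == y else 0.7
```

The De Morgan laws and the Kleene normality condition can then be checked pointwise on sample arguments.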
Entropy of Resemblance Functions.
On the Kleene algebra of resemblance functions one can introduce a measure of non-probabilistic entropy of these functions [5, 9, 15]. Similarity and dissimilarity functions such that \( S\left( {x,y} \right) = D\left( {x,y} \right) = 0.5 \) for all \( x \ne y \) in \( \Omega \), have the maximal entropy and the uncertainty of making decision “x and y are similar” is maximal for such functions [9].
3.4 Min-Transitivity and Hierarchical Clustering
A symmetric and reflexive fuzzy relation \( S:\Omega \times\Omega \to \left[ {0,1} \right] \) is called a fuzzy similarity (fuzzy equivalence) relation [29] if for all x, y, z in \( \Omega \) it satisfies min-transitivity: \( S(x,z) \ge \min(S(x,y), S(y,z)) \).
A dissimilarity function D complementary to a min-transitive similarity function is called an ultrametric and satisfies for all x, y, z in \( \Omega \) the ultrametric inequality: \( D(x,z) \le \max(D(x,y), D(y,z)) \).
For any \( \alpha \in \left[ {0,1} \right] \) the \( \alpha \)-cut of fuzzy relation S defines a crisp relation \( S_{\alpha }\, \subseteq\,\Omega \times\Omega \) as follows: \( S_{\alpha } = \left\{ {\left( {x,y} \right) \in\Omega \times\Omega |S\left( {x,y} \right) \ge \alpha } \right\} \). \( \alpha \)-cuts are nested such that from \( \alpha > \beta \) it follows \( S_{\alpha } \, \subseteq\, S_{\beta } \).
Optimal and Invariant Hierarchical Clustering.
All \( \alpha \)-cuts of a min-transitive similarity function (fuzzy equivalence relation) \( E:\Omega \times\Omega \to \left[ {0,1} \right] \) are non-fuzzy equivalence relations; hence they define nested partitions of the set \( \Omega \) into equivalence classes of these relations. These properties of fuzzy equivalence relations suggest considering hierarchical clustering [21] of the set \( \Omega \) with a similarity function S as a min-transitive transformation of this similarity function S into a fuzzy equivalence relation \( E \). Tamura et al. [26] proposed to transform S into its transitive closure \( \hat{S} \), which is a fuzzy equivalence relation. It was shown [16, 19] that this method coincides with spanning tree clustering and with a version of the single linkage hierarchical clustering algorithm.
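For a finite domain given as a similarity matrix, the transitive closure \( \hat{S} \) can be sketched as a naive fixed-point iteration (illustrative only; practical algorithms are more efficient):

```python
def min_transitive_closure(S):
    """Min-transitive (max-min) closure of a symmetric similarity matrix:
    raise S[i][j] to min(S[i][k], S[k][j]) until min-transitivity
    S[i][j] >= min(S[i][k], S[k][j]) holds for all triples."""
    n = len(S)
    T = [row[:] for row in S]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    v = min(T[i][k], T[k][j])
                    if v > T[i][j]:
                        T[i][j] = v
                        changed = True
    return T

# A toy similarity matrix on three objects (illustrative).
S = [[1.0, 0.8, 0.3],
     [0.8, 1.0, 0.6],
     [0.3, 0.6, 1.0]]
T = min_transitive_closure(S)   # T[0][2] is raised from 0.3 to min(0.8, 0.6) = 0.6
```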
Batyrshin [4, 6] showed that the solution of the problem of optimal approximation of a similarity function by a fuzzy equivalence relation has the form \( E = \widehat{F\left( S \right)} \), i.e. it can be presented as the min-transitive closure of a similarity function \( F\left( S \right) \), where F is a “correction” of S such that \( F\left( S \right) \, \subseteq\, S \). In addition, the problem of constructing hierarchical clustering algorithms invariant under monotone transformations of similarity values and under the initial numbering (indexing) of objects was studied. The solution of this problem has also been presented in the form \( E = \widehat{F\left( S \right)} \), where \( F\left( S \right) \, \subseteq\, S \), and a parametric family of invariant corrections F has been proposed [4, 6].
3.5 Equivalent Resemblance Functions
Two resemblance functions \( R_{1} \) and \( R_{2} \) of the same type defined on the set \( \Omega \) are called equivalent (by ordering) [3, 24] if for all x, y, u, v in \( \Omega \) it is fulfilled: \( R_1(x,y) \le R_1(u,v) \Leftrightarrow R_2(x,y) \le R_2(u,v) \).
It is clear that two equivalent resemblance functions should have the same type.
A continuous, strictly increasing function \( \varphi : \left[ {0,1} \right] \to \left[ {0,1} \right] \) such that \( \varphi \left( 0 \right) = 0 \) and \( \varphi \left( 1 \right) = 1 \) is called an automorphism of the interval \( \left[ {0,1} \right] \).
Proposition 1.
If R is a resemblance function on \( \Omega \) and \( \varphi \) is an automorphism of the interval [0,1], then the function \( R_{1} \) defined for all x, y in \( \Omega \) by: \( R_1(x,y) = \varphi(R(x,y)) \)
will be a resemblance function equivalent to R.
Below are examples of the simplest equivalent transformations of resemblance functions:
For example, instead of a dissimilarity function one can use an equivalent dissimilarity function obtained by such a transformation.
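Proposition 1 can be illustrated as follows (the automorphism \( \varphi(t) = t^2 \) and the toy dissimilarity function are illustrative choices, not from the paper):

```python
def phi(t):
    """An automorphism of [0,1]: continuous, strictly increasing,
    phi(0) = 0, phi(1) = 1. The square is one illustrative choice."""
    return t * t

def transform(r):
    """Compose a resemblance function with phi (Proposition 1)."""
    return lambda x, y: phi(r(x, y))

# A toy dissimilarity function on real numbers (illustrative).
d = lambda x, y: abs(x - y) / (1 + abs(x - y))
d2 = transform(d)   # equivalent to d: the ordering of values is preserved
```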
Equivalent Resemblance Functions and Invariant Clustering.
The equivalence of (dis)similarity functions supposes that the use of such functions in a classification algorithm will give equivalent results. As such clustering algorithms one can use the hierarchical clustering algorithms invariant under monotone transformations of similarity values discussed in the previous section.
4 Correlation Functions
4.1 Correlation Functions and Correlation Triplets
In Sect. 3 we introduced the definition of correlation function as follows.
A function \( A:\Omega \times\Omega \to \left[ { - 1,1} \right] \) is a correlation function (association measure) on \( \Omega \) if for all x, y in \( \Omega \) it is symmetric: \( A(x,y) = A(y,x) \),
reflexive: \( A(x,x) = 1 \),
and negative: \( A\left( {x,y} \right) < 0 \), for some x, y in \( \Omega \).
A correlation function will be called a weak correlation function if it does not satisfy the inverse relationship property considered below. Here we consider and extend some results introduced in [10].
Proposition 2.
Suppose S and D are similarity and dissimilarity functions on \( \Omega \) such that for some x, y in \( \Omega \) it is fulfilled: \( S\left( {x,y} \right) < D\left( {x,y} \right) \); then the function defined for all x, y in \( \Omega \) by: \( A(x,y) = S(x,y) - D(x,y) \) (2)
is a correlation function. If S and D are complementary, then the function A will be a correlation function if for some x, y in Ω it is fulfilled: \( S\left( {x,y} \right) < 0.5 \).
The obtained formula for A has a reasonable interpretation: the correlation between x and y is positive if the similarity between them is greater than the dissimilarity, and the correlation is negative in the opposite case.
If the similarity S and dissimilarity D functions are complementary then the correlation function A defined by (2) is called complementary to S and D. Complementary functions S, D and A will be denoted as (S, D, A) and called a correlation triplet. From the definition of the complementary (dis)similarity functions and from (2) it follows that the similarity, dissimilarity and correlation functions from the correlation triplet (S, D, A) can be obtained one from another for all x, y in Ω as follows: \( S(x,y) = \frac{1 + A(x,y)}{2} \), \( D(x,y) = \frac{1 - A(x,y)}{2} \), (3)
\( A(x,y) = 2S(x,y) - 1 \), (4)
\( A(x,y) = 1 - 2D(x,y) \). (5)
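For a complementary triplet these conversions are simple affine maps between [0, 1] and [-1, 1]; a minimal sketch (function names are illustrative):

```python
def corr_from_sim(s):
    """A = 2S - 1 for a complementary triplet (S, D, A)."""
    return 2 * s - 1

def corr_from_dis(d):
    """A = 1 - 2D."""
    return 1 - 2 * d

def sim_from_corr(a):
    """S = (1 + A) / 2."""
    return (1 + a) / 2

def dis_from_corr(a):
    """D = (1 - A) / 2."""
    return (1 - a) / 2
```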
4.2 Examples of Constructing Correlation Functions from (Dis)Similarity Functions
Hamann coefficient \( \varvec{A}_{\varvec{H}} \) (see Sect. 2.1). The Simple Matching similarity measure \( S_{SM} \left( {x,y} \right) = \frac{a + d}{a + b + c + d} \) has the complementary dissimilarity function \( D_{SM} \left( {x,y} \right) = \frac{b + c}{a + b + c + d} \). From (2) we obtain: \( A_H(x,y) = S_{SM}(x,y) - D_{SM}(x,y) = \frac{a+d-b-c}{a+b+c+d} \).
Yule’s Q association coefficient \( \varvec{A}_{{\varvec{Y} - \varvec{Q}}} \) (Sect. 2.1). The function \( S_{Y} \left( {x,y} \right) = \frac{ad}{ad + bc} \) is a similarity function and the function \( D_{Y} \left( {x,y} \right) = \frac{bc}{ad + bc} \) is its complementary dissimilarity function. From (2) we obtain: \( A_{Y\text{-}Q}(x,y) = S_Y(x,y) - D_Y(x,y) = \frac{ad-bc}{ad+bc} \).
Note that, similarly to the Yule’s Q association and Hamann coefficients, it is easy to construct most of the correlation functions considered for binary data [13].
Pearson’s product-moment correlation coefficient r (Sect. 2.2). The function \( D(x,y) = \frac{1}{4}\sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sqrt{\sum_{j=1}^{n} (x_j - \bar{x})^2}} - \frac{y_i - \bar{y}}{\sqrt{\sum_{j=1}^{n} (y_j - \bar{y})^2}} \right)^2 \)
is a dissimilarity function. From (5) we obtain Pearson’s product-moment correlation coefficient: \( A(x,y) = 1 - 2D(x,y) = r \).
Spearman’s rank correlation coefficient \( \rho \) (see Sect. 2.3). Consider the function: \( D(x,y) = \frac{3\sum_{i=1}^{n} d_i^2}{n(n^2-1)} \), where \( d_i = x_i - y_i \).
It satisfies the properties of dissimilarity functions and from (5) we obtain the Spearman’s rank correlation coefficient: \( A\left( {x,y} \right) = 1 - 2D\left( {x,y} \right) = 1 - \frac{{6\mathop \sum \nolimits_{i = 1}^{n} d_{i}^{2} }}{{n\left( {n^{2} - 1} \right)}} = \rho . \)
Kendall’s rank correlation coefficient \( \varvec{\tau} \) (Sect. 2.3). Consider the functions: \( S(x,y) = \frac{NC}{n(n-1)/2} \), \( D(x,y) = \frac{ND}{n(n-1)/2} \).
They are the complementary similarity and dissimilarity functions, respectively, such that \( S\left( {x,y} \right) + D\left( {x,y} \right) = 1 \), and from (2) we obtain \( A\left( {x,y} \right) = \frac{NC - ND}{{n\left( {n - 1} \right)/2}} = \tau \).
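The Spearman construction can be checked numerically (an illustrative sketch, using the dissimilarity \( D(x,y) = 3\sum d_i^2 / (n(n^2-1)) \) implied by the identity \( \rho = 1 - 2D \) above); it also confirms the classical fact that Spearman’s ρ equals Pearson’s r computed on the ranks:

```python
import math

def pearson(x, y):
    """Pearson's r, applied here to rankings (ranks are real n-tuples)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_dissimilarity(x, y):
    """D(x, y) = 3 * sum(d_i^2) / (n(n^2 - 1)), d_i = x_i - y_i (no ties)."""
    n = len(x)
    return 3 * sum((a - b) ** 2 for a, b in zip(x, y)) / (n * (n * n - 1))

x, y = (1, 2, 3, 4, 5), (2, 1, 4, 3, 5)
rho = 1 - 2 * spearman_dissimilarity(x, y)   # A = 1 - 2D, formula (5)
```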
4.3 Strong (Invertible) Correlation Functions on the Sets with Involution Operation
Initially, the correlation function (association measure) was defined on a set with an involution operation [7] as a function satisfying the inverse relationship property considered below. Such correlation functions will be called here strong or invertible correlation functions. It is perhaps surprising that all correlation functions considered above are invertible. For this reason, a correlation function which is not invertible will be called a weak correlation function.
A function \( N:\Omega \to\Omega \) is called a reflection or a negation on \( \Omega \) if it satisfies for all x in \( \Omega \) the involutivity property: \( N(N(x)) = x \),
and if it is not an identity function, i.e. for some x in \( \Omega \) it is fulfilled: \( N\left( x \right) \ne x \).
An element x in \( \Omega \) such that \( N\left( x \right) = x \) is called a fixed point and the set of all fixed points of the reflection N on \( \Omega \) is denoted as \( FP\left( {N,\Omega } \right) \) or \( FP\left(\Omega \right) \).
Definition 1.
[7] Let N be a reflection on \( \Omega \) and V be a subset of \( {\varOmega \setminus }FP\left(\Omega \right) \) closed under N. A strong correlation function (association measure) on V is a function \( A:V \times V \to \left[ { - 1,1} \right] \) satisfying for all x, y in V the properties: symmetry: \( A(x,y) = A(y,x) \); reflexivity: \( A(x,x) = 1 \); inverse relationship: \( A(x, N(y)) = -A(x,y) \).
The strong correlation function also will be referred to as an invertible correlation function.
Theorem 1.
The correlation function A from a correlation triplet (S, D, A) is invertible if and only if the complementary similarity and dissimilarity functions satisfy the following properties: \( S(x,y) + S(x,N(y)) = 1 \), \( D(x,y) + D(x,N(y)) = 1 \).
These properties will be called bipolarity properties, and the corresponding functions S and D will be called bipolar, see [8, 10]. The value 1 equals the sum of the pole values 0 and 1 of the interval [0, 1] of similarity and dissimilarity values. It is clear that S is bipolar if and only if its complementary dissimilarity function D is bipolar.
Similarly, the inverse relationship property of a correlation function can be written in the form of bipolarity: \( A(x,y) + A(x,N(y)) = 0 \),
taking into account that \( 0 = - 1 + 1 \), i.e. zero equals the sum of the pole values of the interval \( \left[ { - 1,1} \right] \) of correlation values. With this terminology Theorem 1 can be formulated as follows: the correlation function A from a correlation triplet (S, D, A) is bipolar if and only if the similarity and dissimilarity functions S and D are bipolar.
The proof follows, for example, from (4): \( A\left( {x,y} \right) + A\left( {x,N\left( y \right)} \right) = 2S\left( {x,y} \right) - 1 + 2S\left( {x,N\left( y \right)} \right) - 1 = 2\left( {S\left( {x,y} \right) + S\left( {x,N\left( y \right)} \right) - 1} \right) \), and the first sum equals zero if and only if the last sum equals zero.
For complementary (dis)similarity functions we have \( S\left( {x,y} \right) + D\left( {x,y} \right) = 1 \), and the bipolarity of these functions is equivalent to the properties: \( S(x,N(y)) = D(x,y) \), \( D(x,N(y)) = S(x,y) \). (6)
Hence, to prove the inverse relationship property of a correlation function A it is sufficient to show the fulfillment of the bipolarity properties or (6) for the (dis)similarity functions S or D used in the construction of A by means of (2), (4) or (5).
One can show that all correlation functions considered in Sect. 4.2 are invertible with respect to suitable involutions. See some results in [10].
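As an illustration for Pearson’s coefficient (a sketch under the assumption that \( N(y) = -y \) is taken as the involution on real valued n-tuples, which is one suitable choice):

```python
import math

def pearson(x, y):
    """Pearson's product-moment correlation; assumes non-constant n-tuples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def neg(y):
    """N(y) = -y: an involution on real n-tuples (illustrative choice);
    its only fixed point is the zero tuple."""
    return tuple(-v for v in y)

x = (1.0, 2.0, 4.0, 3.0)
y = (2.0, 1.0, 5.0, 4.0)
```

Negating y flips the sign of the covariance and leaves the denominator unchanged, so the inverse relationship \( r(x, N(y)) = -r(x, y) \) holds.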
4.4 Constructing Strong Correlation Functions from Co-symmetric (Dis)Similarity Functions
Similarity S and dissimilarity D functions are consistent on the set \( \Omega \) with involution N if for all x in \( \Omega \) it is, respectively, fulfilled [9]: \( S(x,N(x)) = 0 \), \( D(x,N(x)) = 1 \).
A resemblance function R is co-symmetric on the set \( \Omega \) with involution N if for all x, y in \( \Omega \) it is fulfilled [9]: \( R(N(x),N(y)) = R(x,y) \).
It was shown [7, 9] that a resemblance function R is co-symmetric if and only if for all x, y in \( \Omega \) the following property is fulfilled: \( R(x,N(y)) = R(N(x),y) \).
Proposition 3
[7]. An invertible correlation function is co-symmetric.
Proposition 4
[10]. A bipolar resemblance function is consistent and co-symmetric.
Theorem 1 says that to construct an invertible correlation function from its complementary (dis)similarity functions by (2), (4) or (5) we need bipolar (dis)similarity functions. Constructing such functions for a specific domain is not always easy. From Proposition 4 one can conclude that consistency and co-symmetry may be less restrictive than bipolarity, and it can be easier to construct consistent and co-symmetric (dis)similarity functions than bipolar ones. The following theorem shows how to use them for constructing invertible correlation functions.
Theorem 2
[7]. Let N be a reflection on \( \Omega \) and V be a nonempty subset of \( \Omega \backslash FP\left(\Omega \right) \) closed under the reflection N. Let \( S:V \times V \to \left[ {0,1} \right] \) be a co-symmetric and consistent similarity function; then the function \( A:V \times V \to \left[ { - 1,1} \right] \) defined for all x, y in V by: \( A(x,y) = S(x,y) - S(x,N(y)) \) (7)
is a strong correlation function on V.
Formula (7) has a simple interpretation: the correlation between x and y is positive if x is more similar to y than to its negation, and the correlation is negative in the opposite case.
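A toy instance of Theorem 2 (illustrative, not from the paper): on tuples over {−1, 1} with the involution N(x) = −x, the fraction of matching components is a consistent and co-symmetric similarity function, and (7) yields a strong correlation function:

```python
def neg(x):
    """Involution N(x) = -x on tuples over {-1, 1}; it has no fixed points."""
    return tuple(-v for v in x)

def S(x, y):
    """Fraction of matching components: reflexive, symmetric,
    consistent (S(x, N(x)) = 0) and co-symmetric (S(N(x), N(y)) = S(x, y))."""
    return sum(1 for a, b in zip(x, y) if a == b) / len(x)

def A(x, y):
    """Strong correlation function built by formula (7)."""
    return S(x, y) - S(x, neg(y))

x = (1, 1, -1, 1)
y = (1, -1, -1, -1)
```

On such tuples this A reduces to the mean product \( \sum x_i y_i / n \), i.e. a Hamann-style coefficient on ±1 data.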
More general methods for constructing invertible correlation functions (association measures) have been proposed in [7, 9, 10]. Instead of the difference operation in (7) these methods use pseudo-difference operations, and instead of consistent similarity functions they can use similarity functions satisfying weaker conditions.
5 Conclusion and Future Directions of Research
This work presents a short and updated version of a part of the author’s Lecture at the RAAI Summer School 2019. The presented paper can be considered as complementary to [9]. It includes new methods of constructing correlation functions presented recently at INES 2019 [10] and contains more examples of (dis)similarity and correlation functions illustrating these methods. As future work, it is planned to extend the developed approach to other types of association and relationship measures and to other domains.
References
Aherne, F.J., Thacker, N.A., Rockett, P.I.: The Bhattacharyya metric as an absolute similarity measure for frequency coded data. Kybernetika 34, 363–368 (1998)
Averkin, A.N., Batyrshin, I.Z., Blishun, A.F., Silov, V.B., Tarasov, V.B.: Fuzzy sets in models of control and artificial intelligence. Pospelov, D.A. (ed.) Nauka, Moscow (1986). (in Russian)
Batagelj, V., Bren, M.: Comparing resemblance measures. J. Classif. 12, 73–90 (1995)
Batyrshin, I.Z.: Methods of system analysis based on weighted relations, Ph.D. dissertation. Moscow Power Engineering Institute, Moscow (1982). (in Russian)
Batyrshin, I.Z.: On fuzzinesstic measures of entropy on Kleene algebras. Fuzzy Sets Syst. 34, 47–60 (1990)
Batyrshin, I., Rudas, T.: Invariant hierarchical clustering schemes. In: Batyrshin, I., Kacprzyk, J., Sheremetov, L., Zadeh, L.A. (eds.) Perception-Based Data Mining and Decision Making in Economics and Finance, pp. 181–206. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-36247-0_7
Batyrshin, I.Z.: On definition and construction of association measures. J. Intell. Fuzzy Syst. 29, 2319–2326 (2015)
Batyrshin, I., Monroy-Tenorio, F., Gelbukh, A., Villa-Vargas, L.A., Solovyev, V., Kubysheva, N.: Bipolar rating scales: a survey and novel correlation measures based on non-linear bipolar scoring functions. Acta Polytechnica Hungarica 14, 33–57 (2017)
Batyrshin, I.: Towards a general theory of similarity and association measures: similarity, dissimilarity and correlation functions. J. Intell. Fuzzy Syst. 36(4), 2977–3004 (2019)
Batyrshin, I.Z.: Constructing correlation coefficients from similarity and dissimilarity functions. In: INES 2019, IEEE 23rd IEEE International Conference on Intelligent Engineering Systems, Hungary, 25–27 April. IEEE, Gödöllő (2019)
Birkhoff, G.: Lattice Theory, 3rd edn. American Mathematical Society, Providence (1967)
Chen, P.Y., Popovich, P.M.: Correlation: Parametric and Nonparametric Measures. Sage, Thousand Oaks (2002)
Choi, S.S., Cha, S.H., Charles, C.T.: A survey of binary similarity and distance measures. J. Syst. Cybern. Inform. 8, 43–48 (2010)
Clifford, H.T., Stephenson, W.: An Introduction to Numerical Classification. Academic Press, New York (1975)
De Luca, A., Termini, S.: A definition of a nonprobabilistic entropy in the setting of fuzzy sets. Inform. Control 20, 301–312 (1972)
Dunn, J.C.: A graph theoretic analysis of pattern classification via Tamura’s fuzzy relation. IEEE Trans. Syst. Man Cybern. 3, 310–313 (1974)
Fodor, J.C., Roubens, M.R.: Fuzzy Preference Modelling and Multicriteria Decision Support, vol. 14. Springer, Dordrecht (1994). https://doi.org/10.1007/978-94-017-1648-2
Gibbons, J.D., Chakraborti, S.: Nonparametric Statistical Inference, 4th edn. Dekker, New York (2003)
Gower, J.C., Ross, G.J.S.: Minimum spanning trees and single linkage cluster analysis. Appl. Stat. 18, 54–64 (1969)
Janson, S., Vegelius, J.: Measures of ecological association. Oecologia 49, 371–376 (1981)
Johnson, S.C.: Hierarchical clustering schemes. Psychometrika 32, 241–254 (1967)
Kendall, M.G.: Rank Correlation Methods, 4th edn. Griffin, London (1970)
Legendre, P., Legendre, L.F.: Numerical Ecology, 2nd edn. Elsevier, Amsterdam (1998). English edn.
Lesot, M-J., Rifqi, M., Benhadda, H.: Similarity measures for binary and numerical data: a survey. Int. J. Knowl. Eng. Soft Data Paradigms 1, 63–84 (2009)
Rauschenbach, G.V.: Proximity and similarity measures. In: Analysis of Non-Numerical Information in Sociological Research, Nauka, Moscow, pp. 169–202 (1985). (in Russian)
Tamura, S., Higuchi, S., Tanaka, K.: Pattern classification based on fuzzy relations. IEEE Trans. Syst. Man Cybern. 1, 61–66 (1971)
Tan, P.N., Kumar, V., Srivastava, J.: Selecting the right interestingness measure for association patterns. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 32–41 (2002)
Zadeh, L.A.: Fuzzy sets. Inf. Control 8, 338–353 (1965)
Zadeh, L.A.: Similarity relations and fuzzy orderings. Inf. Sci. 3, 177–200 (1971)
Acknowledgements
This work was partially supported by the project SIP 20196374 IPN and by the Organizing Committee of the RAAI Summer School. The author thanks all organizers of the RAAI Summer School and the editors of this book. Special thanks to doctors Gennady Osipov, Alexander Panov and Maria Koroleva.
© 2019 Springer Nature Switzerland AG
Batyrshin, I.Z. (2019). Data Science: Similarity, Dissimilarity and Correlation Functions. In: Osipov, G., Panov, A., Yakovlev, K. (eds) Artificial Intelligence. Lecture Notes in Computer Science(), vol 11866. Springer, Cham. https://doi.org/10.1007/978-3-030-33274-7_2