1 Introduction

Since it was proposed by Zadeh [1], the theory of fuzzy sets (FSs) has achieved great success owing to its capability of handling uncertainty [2, 3]. Consequently, over the last decades, several higher order fuzzy sets have been introduced in the literature. The intuitionistic fuzzy (IF) set, one of the higher order FSs, was proposed by Atanassov [4] to deal with vagueness. The main advantage of the IF set is its capability of coping with the uncertainty that may arise from information imprecision. It assigns to each element a membership degree, a non-membership degree and a hesitation degree, and thus constitutes an extension of Zadeh's fuzzy set, which assigns to each element only a membership degree, taking 1 minus it as the degree of non-membership [5]. Hence, the IF set is regarded as a more effective way of dealing with vagueness than the FS [6].

Similarity measure is of great significance in almost every scientific field. A similarity measure between two IF sets quantifies the commonality of the information they convey. Since it is difficult to measure the amount of information hidden in IF sets, we cannot compare two IF sets directly. Therefore, similarity measures and their counterpart, distance measures, play an important role in discriminating IF sets. With the development of IF set theory, the definition of similarity measures for IF sets has also received considerable attention in recent years [7, 8]. It has developed, and will continue to develop, into an important tool for decision making, fault detection, pattern recognition, machine learning, image processing, etc.

Owing to its fundamental importance in applications, many similarity measures have been proposed. The first study was carried out by Szmidt and Kacprzyk [9], who applied the Hamming distance and the Euclidean distance to the IF environment and compared them with the approaches used for ordinary FSs. Following this work, many researchers presented similarity measures for IF sets by extending well-known distance measures, such as the Hamming distance, the Euclidean distance, and the Hausdorff distance [10–16]. Meanwhile, some studies defined new similarity measures for IF sets by introducing intermediate variables based on membership and non-membership degrees [7, 17–21]. For example, Li and Cheng [17] suggested a new similarity measure for IF sets based on the definition of \(\varphi_{A}\). Recently, many novel similarity measures have continued to emerge, as illustrated by measures defined on the basis of cosine similarity [8], the Sugeno integral [22], interval comparison [23], intuitionistic entropy [24], and so on. In addition, Boran and Akay [25] proposed a new general type of similarity measure for IF sets with two parameters, expressing the norm and the level of uncertainty, respectively. This similarity measure also behaves sensibly in the known counter-intuitive cases. As a comprehensive study on similarity measures of IF sets, Papakostas et al. [26] investigated the main theoretical and computational properties of the measures, as well as the relationships between them. Moreover, they carried out a comparison of the distance and similarity measures from a pattern recognition point of view.

Among the proposed similarity measures between IF sets, however, some cannot satisfy the axioms of similarity, some produce counter-intuitive cases, and some take rather complex forms. Therefore, in this paper we propose a new similarity measure between IF sets, based on the cosine similarity and the Euclidean distance between IF sets. Axiomatic definitions of the similarity and distance measures are first presented, followed by the relation between similarity and distance measures. The similarity matrix is also defined to describe relationships among more than two IF sets. The properties of the similarity matrix are examined to explore the performance of our new similarity measure, which is defined after a critical analysis of the cosine similarity and the similarity generated by the Euclidean distance. The properties and performance of the proposed similarity measure are demonstrated by both mathematical proofs and illustrative examples.

The remainder of this paper is organized as follows. Section 2 recalls the definitions related to IF sets. In Sect. 3, the distance measure, similarity measure, similarity matrix, and their properties with proofs are presented. The new similarity measure is defined in Sect. 4, where its properties are also proved. A comparison between similarity measures and an illustration of the positive definiteness of the similarity matrix are carried out in Sect. 5. We conclude the paper in Sect. 6.

2 Preliminaries

In this section, we briefly recall the basic concepts related to IF set, and then list the properties of the axiomatic definition for similarity measures.

Definition 1

[1] Let \(X = \{ x_{1} ,x_{2} , \ldots ,x_{n} \}\) be a universe of discourse, then a FS A in X is defined as follows:

$$A = \left\{ {\left\langle {x,\mu_{A} (x)} \right\rangle \left| {x \in X} \right.} \right\}$$
(1)

where \(\mu_{A} (x):X \to [0,1]\) is the membership degree.

Definition 2

[4] An IF set A in X defined by Atanassov can be written as

$$A = \left\{ {\left\langle {x,\mu_{A} (x),v_{A} (x)} \right\rangle \left| {x \in X} \right.} \right\}$$
(2)

where \(\mu_{A} (x):X \to [0,1]\) and \(v_{A} (x):X \to [0,1]\) are membership degree and non-membership degree, respectively, with the condition:

$$0 \le \mu_{A} (x) + v_{A} (x) \le 1$$
(3)

\(\pi_{A} (x)\) determined by the following expression:

$$\pi_{A} (x) = 1 - \mu_{A} (x) - v_{A} (x)$$
(4)

is called the hesitancy degree of the element x ∊ X to the set A, and \(\pi_{A} (x) \in [0,1],\,\forall x \in X\).

\(\pi_{A} (x)\) is also called the intuitionistic index of x to A. Greater \(\pi_{A} (x)\) indicates more vagueness on x. Obviously, when \(\pi_{A} (x) = 0,\,\forall x \in X\), the IF set degenerates into an ordinary FS.
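As a small numerical illustration (a Python sketch; the function name is ours, not part of the original formulation), the hesitancy degree of Eq. (4) can be computed directly from a membership/non-membership pair:

```python
def hesitancy(mu, v):
    """Hesitancy degree pi_A(x) = 1 - mu_A(x) - v_A(x) of Eq. (4)."""
    if not (0.0 <= mu <= 1.0 and 0.0 <= v <= 1.0 and mu + v <= 1.0):
        raise ValueError("invalid IF value: need mu, v in [0,1] with mu + v <= 1")
    return 1.0 - mu - v

print(hesitancy(0.3, 0.2))  # ~0.5: considerable vagueness about x
print(hesitancy(0.6, 0.4))  # 0.0: degenerates into an ordinary fuzzy value
```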

In the sequel, the couple \(\left\langle {\mu_{A} (x),v_{A} (x)} \right\rangle\) is called an IF set or IF value for clarity. Let IFSs(X) denote the set of all IF sets in X.

Definition 3

For A ∊ IFSs(X) and B ∊ IFSs(X), some relations between them are defined as:

(R1):

A ⊂ B iff \(\forall x \in X\,\mu_{A} (x) \le \mu_{B} (x),v_{A} (x) \ge v_{B} (x)\)

(R2):

A = B iff \(\forall x \in X\,\mu_{A} (x) = \mu_{B} (x),v_{A} (x) = v_{B} (x)\)

(R3):

\(A^{C} = \left\{ {\left\langle {x,v_{A} (x),\mu_{A} (x)} \right\rangle \left| {x \in X} \right.} \right\}\), where \(A^{C}\) is the complement of A.

It is worth noting that besides Definition 2 there are other possible representations of IF sets proposed in the literature. Hong and Choi [27] proposed to use an interval representation \(\left[ {\mu_{A} (x),1 - v_{A} (x)} \right]\) of IF set A in X instead of pair \(\left\langle {\mu_{A} (x),v_{A} (x)} \right\rangle\). This approach is equivalent to the interval valued FSs interpretation of IF set, where the interval \(\left[ {\mu_{A} (x),1 - v_{A} (x)} \right]\) represents the membership degree of x ∊ X to the set A. Obviously, \(\left[ {\mu_{A} (x),1 - v_{A} (x)} \right]\) is a valid interval, since \(\mu_{A} (x) \le 1 - v_{A} (x)\) always holds for \(\mu_{A} (x) + v_{A} (x) \le 1\).
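The equivalence between the two representations can be sketched in Python (the function names are our own; the conversion follows Hong and Choi's interval reading [27]):

```python
def to_interval(mu, v):
    """Interval reading [mu_A(x), 1 - v_A(x)] of an IF value (Hong and Choi [27])."""
    assert 0.0 <= mu and 0.0 <= v and mu + v <= 1.0
    return (mu, 1.0 - v)  # always a valid interval, since mu <= 1 - v

def from_interval(lo, hi):
    """Recover the pair <mu_A(x), v_A(x)> from the interval representation."""
    assert 0.0 <= lo <= hi <= 1.0
    return (lo, 1.0 - hi)

print(to_interval(0.3, 0.2))                  # (0.3, 0.8)
print(from_interval(*to_interval(0.3, 0.2)))  # back to (0.3, 0.2), up to rounding
```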

3 Distance and similarity measures for IF sets

3.1 Related definitions and properties

Generally, a distance is a measure of the difference between two elements of a set. For IF sets, the distance between them must satisfy the axiomatic definition of a metric distance. Moreover, the distance should not lead to counter-intuitive results, i.e., the distance measure must faithfully reflect how close IF sets are to each other. Hence we have the following definition.

Definition 4

Let D denote a mapping \(D:IFS\, \times \, IFS \to [0,1]\). If \(D(A,B)\) satisfies the following properties, it is called a distance between \(A \in IFSs(X)\) and \(B \in IFSs(X)\):

(D1):

\(0 \le D(A,B) \le 1,\)

(D2):

\(D(A,B) = 0 \Leftrightarrow A = B,\)

(D3):

\(D(A,B) = D(B,A),\)

(D4):

If \(A \subseteq B \subseteq C\), then \(D(A,B) \le D(A,C)\), and \(D(B,C) \le D(A,C),\)

(D5):

\(D(A,B) + D(B,C) \ge D(A,C).\)

As the complementary concept of the distance measure, the similarity measure between two IF sets can be described by the following definition.

Definition 5

A mapping \(S:IFS \times IFS \to [0,1]\) is called a degree of similarity between \(A \in IFSs(X)\) and \(B \in IFSs(X)\), if \(S(A,B)\) satisfies the following properties:

(S1):

\(0 \le S(A,B) \le 1,\)

(S2):

\(S(A,B) = 1 \Leftrightarrow A = B,\)

(S3):

\(S(A,B) = S(B,A),\)

(S4):

If \(A \subseteq B \subseteq C\), then \(S(A,B) \ge S(A,C)\) and \(S(B,C) \ge S(A,C).\)

Theorem 1

Let D denote the distance measure between IF sets. Then, \(S_{D} = 1 - D\) is the similarity measure between IF sets.

Proof

As the distance measure between IF sets, D satisfies the conditions in Definition 4 as:

$$\begin{aligned} & 0 \le D(A,B) \le 1, \\ & D(A,B) = 0 \Leftrightarrow A = B,\,D(A,B) = D(B,A) \\ \end{aligned}$$

Considering \(S_{D} = 1 - D\), we can get the following expressions straightforwardly:

$$\begin{aligned} & 0 \le S_{D} (A,B) \le 1, \\ & S_{D} (A,B) = 1 \Leftrightarrow D(A,B) = 0 \Leftrightarrow A = B, \\ & S_{D} (A,B) = S_{D} (B,A). \\ \end{aligned}$$

Given IF sets A, B and C satisfying \(A \subseteq B \subseteq C\), we have \(D(A,B) \le D(A,C)\) and \(D(B,C) \le D(A,C).\)

So we get:

$$1 - D(A,B) \ge 1 - D(A,C),\quad 1 - D(B,C) \ge 1 - D(A,C).$$

Thus, \(S_{D} (A,B) \ge S_{D} (A,C)\), \(S_{D} (B,C) \ge S_{D} (A,C)\).

Hence, \(S_{D} (A,B) = 1 - D\) is the similarity measure between IF sets. \(\square\)

Theorem 1 shows that a distance measure can be used to define its complementary concept, a similarity measure. Conversely, given a similarity measure S, we can learn from the proof of Theorem 1 that D = 1 − S satisfies all the properties in Definition 4 except D5, the triangle inequality. So every distance measure can be transformed into a similarity measure, but not vice versa. In this sense, the axiomatic definition of the distance measure is stricter than that of the similarity measure.
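As a sketch of Theorem 1 in Python, the transform \(S_D = 1 - D\) is a one-liner; the Hamming-type distance below is a stand-in example (a two-term variant; some formulations in [9] also include the hesitation term):

```python
def hamming_distance(A, B):
    """A normalized Hamming-type distance between IF sets, each given as a
    list of (mu, v) pairs over the same universe."""
    n = len(A)
    return sum(abs(ma - mb) + abs(va - vb)
               for (ma, va), (mb, vb) in zip(A, B)) / (2 * n)

def similarity_from_distance(D, A, B):
    """Theorem 1: S_D = 1 - D is a similarity measure whenever D is a distance."""
    return 1.0 - D(A, B)

A = [(0.3, 0.2), (0.5, 0.4)]
B = [(0.4, 0.3), (0.5, 0.3)]
print(similarity_from_distance(hamming_distance, A, B))  # ~0.925
```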

Since the inception of IF sets, many similarity measures between IF sets have been proposed in the technical literature. Table 1 summarizes several well-known similarity measures that will be analyzed in this paper. In this table, we let \(X = \{ x_{1} ,x_{2} , \ldots ,x_{n} \}\) be a universe of discourse, and \(A \in IFSs(X)\) and \(B \in IFSs(X)\) be two IF sets in X, denoted by \(A = \left\{ {\left\langle {x,\mu_{A} (x),v_{A} (x)} \right\rangle \left| {x \in X} \right.} \right\}\) and \(B = \left\{ {\left\langle {x,\mu_{B} (x),v_{B} (x)} \right\rangle \left| {x \in X} \right.} \right\}\), respectively. For clarity, we only give the expressions of the similarity measures, omitting the interpretations of the intermediate variables, which can be found in the related references.

Table 1 Existing similarity measures

3.2 Similarity matrix for IF sets

The distance and similarity measures can describe the relationship between two IF sets. Under most circumstances, however, we are confronted with more than two IF sets, where pairwise distance and similarity measures alone cannot conveniently express the relationships among them. So we define the similarity matrix for IF sets.

Definition 6

Let \(A_{1} ,A_{2} , \ldots ,A_{N}\) denote \(N(N \ge 2)\) IF sets in the universe of discourse \(X = \{ x_{1} ,x_{2} , \ldots ,x_{n} \} .\) S denotes the similarity measure between IF sets. The similarity matrix between them is defined as:

$$\varvec{S} = \left[ {\begin{array}{*{20}c} {S(A_{1} ,A_{1} )} & {S(A_{1} ,A_{2} )} & \cdots & {S(A_{1} ,A_{N} )} \\ {S(A_{2} ,A_{1} )} & {S(A_{2} ,A_{2} )} & \cdots & {S(A_{2} ,A_{N} )} \\ \vdots & \vdots & {} & \vdots \\ {S(A_{N} ,A_{1} )} & {S(A_{N} ,A_{2} )} & \cdots & {S(A_{N} ,A_{N} )} \\ \end{array} } \right]$$
(5)

Since \(S(A_{i} ,A_{j} ) = S(A_{j} ,A_{i} )\) and \(S(A_{i} ,A_{i} ) = 1\) for \(i,j = 1,2, \ldots ,N\), the similarity matrix \(\varvec{S}\) is a square, symmetric matrix with 1 as its diagonal elements.
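Building the similarity matrix of Eq. (5) is mechanical once a pairwise measure is chosen; a Python sketch (the stand-in measure below is only illustrative, not the measure proposed later in this paper):

```python
def similarity_matrix(sets, S):
    """Similarity matrix of Eq. (5): an N x N array of pairwise similarities."""
    N = len(sets)
    return [[S(sets[i], sets[j]) for j in range(N)] for i in range(N)]

def s_hamming(A, B):
    # Stand-in measure: 1 minus a normalized Hamming-type distance; any
    # measure satisfying Definition 5 could be plugged in instead.
    n = len(A)
    d = sum(abs(ma - mb) + abs(va - vb)
            for (ma, va), (mb, vb) in zip(A, B)) / (2 * n)
    return 1.0 - d

sets = [[(0.3, 0.2)], [(0.4, 0.3)], [(0.7, 0.1)]]
M = similarity_matrix(sets, s_hamming)
# square and symmetric, with 1 as every diagonal element
print(M)
```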

Theorem 2

The similarity matrix \(\varvec{S}\) between IF sets is a non-singular matrix.

Proof

Let’s suppose that \(\varvec{S}\) is a singular matrix. Then at least two of its column vectors are linearly dependent. Without any loss of generality, we can assume \(\varvec{x}_{j}\) and \(\varvec{x}_{k}\) are linearly dependent. So we have:

$$\varvec{x}_{j} = t \cdot \varvec{x}_{k} .$$

Hence,

$$x_{jp} = t \cdot x_{kp} \quad {\text{for all}}\quad p = 1,2, \ldots ,N.$$

Given p = j and p = k, we have:

$$x_{jj} = t \cdot x_{kj} = 1,\quad x_{jk} = t \cdot x_{kk} = t$$

And then, two contradictory equations can be achieved as:

$$t = 1/x_{kj} > 1\quad {\text{and}}\quad t = x_{jk} < 1.$$

Consequently, the assumption that \(\varvec{S}\) is a singular matrix does not hold.

So the similarity matrix S between IF sets is a non-singular matrix.\(\square\)

Theorem 3

Let D be a metric distance measure between IF sets. The similarity matrix \(\varvec{S}_{D}\) defined by \(S_{D} = 1 - D\) according to Eq. (5) is a positive definite matrix.

Proof

Theorem 1 indicates that \(S_{D} = 1 - D\) is a similarity measure between IF sets. The corresponding similarity matrix can be expressed as:

$$\varvec{S}_{D} = \left[ {\begin{array}{*{20}c} {1 - D(A_{1} ,A_{1} )} & {1 - D(A_{1} ,A_{2} )} & \cdots & {1 - D(A_{1} ,A_{N} )} \\ {1 - D(A_{2} ,A_{1} )} & {1 - D(A_{2} ,A_{2} )} & \cdots & {1 - D(A_{2} ,A_{N} )} \\ \vdots & \vdots & \ddots & \vdots \\ {1 - D(A_{N} ,A_{1} )} & {1 - D(A_{N} ,A_{2} )} & \cdots & {1 - D(A_{N} ,A_{N} )} \\ \end{array} } \right]$$

Given \(D(A_{i} ,A_{i} ) = 0\) and \(D(A_{i} ,A_{j} ) = D(A_{j} ,A_{i} )\) for i = 1, 2, …, N, we also get:

$$\begin{aligned} \varvec{S}_{D} \text{ = }\varvec{S}_{D}^{{\mathbf{T}}} \hfill \\ \quad {\kern 1pt} = \left[ {\begin{array}{*{20}c} 1 & {1 - D(A_{1} ,A_{2} )} & \cdots & {1 - D(A_{1} ,A_{N} )} \\ {1 - D(A_{2} ,A_{1} )} & 1 & \cdots & {1 - D(A_{2} ,A_{N} )} \\ \vdots & \vdots & {} & \vdots \\ {1 - D(A_{N} ,A_{1} )} & {1 - D(A_{N} ,A_{2} )} & \cdots & 1 \\ \end{array} } \right] \hfill \\ \end{aligned}$$

Since the eigenvalues of a symmetric matrix are all real numbers, the similarity matrix \(\varvec{S}_{D}\) has N real eigenvalues (counted with multiplicity), denoted by \(\lambda_{1} ,\lambda_{2} , \ldots ,\lambda_{N}\).

Let λ be an arbitrary eigenvalue of \(\varvec{S}_{D}\), i.e., \(\lambda \in \left\{ {\lambda_{1} ,\lambda_{2} , \ldots ,\lambda_{N} } \right\}\). By the Gerschgorin circle theorem, we can get:

$$\left| {\lambda - 1} \right| \le \sum\limits_{\begin{subarray}{l} j = 1 \\ j \ne i \end{subarray} }^{N} {\left( {1 - D(A_{i} ,A_{j} )} \right)} ,{\kern 1pt} \quad \exists i \in \{ 1,2, \ldots ,N\}$$

Thus,

$$\begin{aligned} \left| {\lambda - 1} \right| & \le \sum\limits_{\begin{subarray}{l} j = 1 \\ j \ne i \end{subarray} }^{N} {\left( {1 - D(A_{i} ,A_{j} )} \right)} \\ & = (N - 1) - \sum\limits_{\begin{subarray}{l} j = 1 \\ j \ne i \end{subarray} }^{N} {D(A_{i} ,A_{j} )} ,{\kern 1pt} \quad \exists i \in \{ 1,2, \ldots ,N\} \\ \end{aligned}$$

Considering the following relations:

$$\sum\limits_{k = 1}^{N} {\lambda_{k} } = \sum\limits_{i = 1}^{N} {S_{ii} } = N\quad {\text{and}}\quad N \cdot \lambda_{\hbox{max} } \ge \sum\limits_{k = 1}^{N} {\lambda_{k} } \ge N \cdot \lambda_{\hbox{min} } ,$$

where \(\lambda_{\hbox{min} }\) and \(\lambda_{\hbox{max} }\) are the minimum and maximum eigenvalues, respectively, we have \(\lambda_{\hbox{min} } \le 1\) and \(\lambda_{\hbox{max} } \ge 1\).

Then, we can get:

$$\begin{aligned} \left| {\lambda_{\hbox{min} } - 1} \right| & = 1 - \lambda_{\hbox{min} } \\ & \le (N - 1) - \sum\limits_{\begin{subarray}{l} j = 1 \\ j \ne i \end{subarray} }^{N} {D(A_{i} ,A_{j} )} ,\quad \exists i \in \{ 1,2, \ldots ,N\} , \\ \lambda_{\hbox{min} } & \ge 2 + \sum\limits_{\begin{subarray}{l} j = 1 \\ j \ne i \end{subarray} }^{N} {D(A_{i} ,A_{j} )} - N,{\kern 1pt} \quad \exists i \in \{ 1,2, \ldots ,N\} . \\ \end{aligned}$$

No generality will be lost by considering i = 1, i.e., by taking \(\lambda_{\hbox{min} }\) to lie in the first Gerschgorin circle of \(\varvec{S}_{D}\). So we have:

$$\lambda_{\hbox{min} } \ge 2 + \sum\limits_{j = 2}^{N} {D(A_{1} ,A_{j} )} - N.$$
(i) For N = 2, we have:

$$\lambda_{\hbox{min} } \ge 2 + D(A_{1} ,A_{2} ) - 2 = D(A_{1} ,A_{2} ) > 0.$$
(ii) For N = 3, we have:

$$\begin{aligned} \lambda_{\hbox{min} } & \ge 2 + D(A_{1} ,A_{2} ) + D(A_{1} ,A_{3} ) - 3 \\ & = D(A_{1} ,A_{2} ) + D(A_{1} ,A_{3} ) - 1 \\ & \ge D(A_{2} ,A_{3} ) - 1. \\ \end{aligned}$$

Considering that \(D(A_{2} ,A_{3} )\) is arbitrary and \(D(A_{2} ,A_{3} ) - 1 \le 0\), only \(\lambda_{\hbox{min} } \ge 0\) can make \(\lambda_{\hbox{min} } \ge D(A_{2} ,A_{3} ) - 1\) hold for every \(D(A_{2} ,A_{3} )\). So we have \(\lambda_{\hbox{min} } \ge 0\).

(iii) Given \(N \ge 4\) and \(D(A_{1} ,A_{2} ) + D(A_{1} ,A_{3} ) \ge D(A_{2} ,A_{3} )\), we have:

$$\begin{aligned} \lambda_{\hbox{min} } & \ge 2 + \sum\limits_{j = 2}^{N} {D(A_{1} ,A_{j} )} - N \\ & = 2 - N + \left( {D(A_{1} ,A_{2} ) + D(A_{1} ,A_{3} )} \right) + \sum\limits_{j = 4}^{N} {D(A_{1} ,A_{j} )} \\ & \ge 2 - N + D(A_{2} ,A_{3} ) + \sum\limits_{j = 4}^{N} {D(A_{1} ,A_{j} )} . \\ \end{aligned}$$

We can also get

$$\begin{aligned} & 2 - N + D(A_{2} ,A_{3} ) + \sum\limits_{j = 4}^{N} {D(A_{1} ,A_{j} )} \\ & \quad \le 2 - N + 1 + (N - 3) = 0. \\ \end{aligned}$$

Since \(\lambda_{\hbox{min} } \ge 2 - N + D(A_{2} ,A_{3} ) + \sum\nolimits_{j = 4}^{N} {D(A_{1} ,A_{j} )}\) holds for arbitrary \(D(A_{2} ,A_{3} )\) and \(D(A_{1} ,A_{j} )\) (\(j = 4,5, \ldots ,N\)), \(\lambda_{\hbox{min} }\) should be no less than the maximum of the right-hand side. Then we have \(\lambda_{\hbox{min} } \ge 0\).

Considering (i)–(iii), we can conclude that all the eigenvalues of \(\varvec{S}_{D}\) are nonnegative. So the similarity matrix \(\varvec{S}_{D}\) is positive semidefinite.

Taking \(\prod\nolimits_{i = 1}^{N} {\lambda_{i} = \det \varvec{S}_{D} }\) and \(\det \varvec{S}_{D} \ne 0\) (\(\varvec{S}_{D}\) is non-singular by Theorem 2) into consideration, we see that none of the eigenvalues of \(\varvec{S}_{D}\) is zero, i.e., all the eigenvalues of \(\varvec{S}_{D}\) are strictly positive.

So the similarity matrix \(\varvec{S}_{D}\) defined by \(S_{D} = 1 - D\) is positive definite. \(\square\)
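Theorem 3 can be checked numerically. In the Python sketch below (names are ours), a Hamming-type metric stands in for D, and positive definiteness is tested by attempting a Cholesky factorization:

```python
import math

def hamming_distance(A, B):
    """A Hamming-type metric distance between IF sets (lists of (mu, v) pairs)."""
    n = len(A)
    return sum(abs(ma - mb) + abs(va - vb)
               for (ma, va), (mb, vb) in zip(A, B)) / (2 * n)

def is_positive_definite(M, eps=1e-12):
    """A symmetric matrix is positive definite iff its Cholesky factorization
    succeeds with strictly positive pivots."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = M[i][i] - s
                if pivot <= eps:
                    return False
                L[i][i] = math.sqrt(pivot)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return True

sets = [[(0.3, 0.2)], [(0.4, 0.3)], [(0.7, 0.1)], [(0.2, 0.6)]]
S_D = [[1.0 - hamming_distance(a, b) for b in sets] for a in sets]
print(is_positive_definite(S_D))  # True, as Theorem 3 predicts
```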

4 A new similarity measure between IF sets

Among the similarity measures shown in Table 1, some are based on well-known distance measures, such as the Hamming distance and the Euclidean distance, while others are characterized by linear or non-linear combinations of the membership and non-membership functions of the IF sets. Since the similarity measures based on the Euclidean distance have a clear physical meaning, we review the generalization of the Euclidean distance to the IF environment [11]. Moreover, some of its properties will be presented along with their proofs.

Definition 7

[11] The distance between two IF sets A and B in \(X = \{ x_{1} ,x_{2} , \ldots ,x_{n} \}\) can be defined as:

$$D_{o} (A,B) = \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {\left( {\mu_{A} (x_{i} ) - \mu_{B} (x_{i} )} \right)^{2} + \left( {v_{A} (x_{i} ) - v_{B} (x_{i} )} \right)^{2} } \right)} }}{2n}}$$
(6)
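Eq. (6) translates directly into code; a minimal Python sketch, assuming each IF set is given as a list of (mu, v) pairs over a common universe:

```python
import math

def d_o(A, B):
    """Euclidean-type distance of Eq. (6); A and B are lists of (mu, v) pairs."""
    n = len(A)
    s = sum((ma - mb) ** 2 + (va - vb) ** 2
            for (ma, va), (mb, vb) in zip(A, B))
    return math.sqrt(s / (2 * n))

print(d_o([(0.3, 0.2)], [(0.4, 0.3)]))  # sqrt((0.01 + 0.01) / 2) ~ 0.1
```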

Theorem 4

\(D_{o} (A,B)\) is a metric distance measure between IF sets.

Proof

(1) Since \(\mu (x),v(x) \in [0,1]\), we have:

$$- 1 \le \mu_{A} (x_{i} ) - \mu_{B} (x_{i} ) \le 1\quad {\text{and}}\quad - 1 \le v_{A} (x_{i} ) - v_{B} (x_{i} ) \le 1,$$

Hence,

$$0 \le \left( {\mu_{A} (x_{i} ) - \mu_{B} (x_{i} )} \right)^{2} + \left( {v_{A} (x_{i} ) - v_{B} (x_{i} )} \right)^{2} \le 1 + 1 = 2$$

So we get:

$$\begin{aligned} 0 & \le \frac{{\sum\nolimits_{i = 1}^{n} {\left( {\left( {\mu_{A} (x_{i} ) - \mu_{B} (x_{i} )} \right)^{2} + \left( {v_{A} (x_{i} ) - v_{B} (x_{i} )} \right)^{2} } \right)} }}{2n} \\ & \le \frac{{\sum\nolimits_{i = 1}^{n} 2 }}{2n} = 1 \\ \end{aligned}$$

Thus, \(0 \le D_{o} (A,B) \le 1\).

(2) \(D_{o} (A,A) = 0\) is straightforward. Conversely, if \(D_{o} (A,B) = 0\), according to the definition of D o , it must follow that \(\mu_{A} (x) = \mu_{B} (x)\) and \(v_{A} (x) = v_{B} (x)\) for all x ∊ X, i.e., A = B. So we have \(D_{o} (A,B) = 0 \Leftrightarrow A = B\).

(3) It is evident that \(D_{o} (A,B) = D_{o} (B,A)\).

(4) Given A ⊆ B ⊆ C, we have \(\mu_{A} (x) \le \mu_{B} (x) \le \mu_{C} (x)\) and \(v_{A} (x) \ge v_{B} (x) \ge v_{C} (x)\) for all x ∊ X. Therefore,

$$\begin{aligned} 0 & \le \mu_{B} (x_{i} ) - \mu_{A} (x_{i} ) \le \mu_{C} (x_{i} ) - \mu_{A} (x_{i} ), \\ 0 & \le \mu_{C} (x_{i} ) - \mu_{B} (x_{i} ) \le \mu_{C} (x_{i} ) - \mu_{A} (x_{i} ), \\ 0 & \le v_{A} (x_{i} ) - v_{B} (x_{i} ) \le v_{A} (x_{i} ) - v_{C} (x_{i} ), \\ 0 & \le v_{B} (x_{i} ) - v_{C} (x_{i} ) \le v_{A} (x_{i} ) - v_{C} (x_{i} ),\quad {\text{for}}\,\,\,\forall x_{i} \in X. \\ \end{aligned}$$

Consequently,

$$\begin{aligned} & \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {\left( {\mu_{B} (x_{i} ) - \mu_{A} (x_{i} )} \right)^{2} + \left( {v_{A} (x_{i} ) - v_{B} (x_{i} )} \right)^{2} } \right)} }}{2n}} \\ & \quad \le \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {\left( {\mu_{C} (x_{i} ) - \mu_{A} (x_{i} )} \right)^{2} + \left( {v_{A} (x_{i} ) - v_{C} (x_{i} )} \right)^{2} } \right)} }}{2n}} , \\ & \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {\left( {\mu_{C} (x_{i} ) - \mu_{B} (x_{i} )} \right)^{2} + \left( {v_{B} (x_{i} ) - v_{C} (x_{i} )} \right)^{2} } \right)} }}{2n}} \\ & \quad \le \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {\left( {\mu_{C} (x_{i} ) - \mu_{A} (x_{i} )} \right)^{2} + \left( {v_{A} (x_{i} ) - v_{C} (x_{i} )} \right)^{2} } \right)} }}{2n}} . \\ \end{aligned}$$

According to Eq. (6), we can get: \(D_{o} (A,B) \le D_{o} (A,C)\) and \(D_{o} (B,C) \le D_{o} (A,C)\).

(5) By the Cauchy inequality, \(\left( {\sum\nolimits_{i = 1}^{n} {a_{i} b_{i} } } \right)^{2} \le \left( {\sum\nolimits_{i = 1}^{n} {a_{{_{i} }}^{2} } } \right)\left( {\sum\nolimits_{i = 1}^{n} {b_{{_{i} }}^{2} } } \right)\), we can get:

$$\begin{aligned} \sum\limits_{i = 1}^{n} {\left( {a_{i} + b_{i} } \right)^{2} } & = \sum\limits_{i = 1}^{n} {a_{i}^{2} } + \sum\limits_{i = 1}^{n} {b_{i}^{2} } + 2\sum\limits_{i = 1}^{n} {a_{i} b_{i} } \\ & \le \sum\limits_{i = 1}^{n} {a_{i}^{2} } + \sum\limits_{i = 1}^{n} {b_{i}^{2} } + 2\left( {\sum\limits_{i = 1}^{n} {a_{i}^{2} \sum\limits_{i = 1}^{n} {b_{i}^{2} } } } \right)^{1/2} \\ & = \left( {\sqrt {\sum\limits_{i = 1}^{n} {a_{i}^{2} } } + \sqrt {\sum\limits_{i = 1}^{n} {b_{i}^{2} } } } \right)^{2} . \\ \end{aligned}$$

Making the following assignments:

$$\begin{aligned} a = \sum\limits_{i = 1}^{n} {\left( {\mu_{A} (x_{i} ) - \mu_{B} (x_{i} )} \right)^{2} } ,\quad b = \sum\limits_{i = 1}^{n} {\left( {\mu_{B} (x_{i} ) - \mu_{C} (x_{i} )} \right)^{2} } \hfill \\ g = \sum\limits_{i = 1}^{n} {\left( {v_{A} (x_{i} ) - v_{B} (x_{i} )} \right)^{2} } ,\quad h = \sum\limits_{i = 1}^{n} {\left( {v_{B} (x_{i} ) - v_{C} (x_{i} )} \right)^{2} } \hfill \\ \end{aligned}$$

we have:

$$\begin{aligned} & 2n\left( {D_{o} (A,C)} \right)^{2} \\ & \quad =\sum\nolimits_{i = 1}^{n} {\left( {\left( {\mu_{A} (x_{i} ) -\mu_{C} (x_{i} )} \right)^{2} + \left( {v_{A} (x_{i} ) - v_{C}(x_{i} )} \right)^{2} } \right)} \\ & \quad = \sum\nolimits_{i =1}^{n} {\left( \begin{array}{l} \left( {\mu_{A} (x_{i} ) - \mu_{B}(x_{i} ) + \mu_{B} (x_{i} ) - \mu_{C} (x_{i} )} \right)^{2} \hfill\\ \quad + \left( {v_{A}(x_{i} ) - v_{B} (x_{i} ) + v_{B} (x_{i} ) - v_{C} (x_{i} )}\right)^{2} \hfill \\ \end{array} \right)} \\ & \quad \le \left({\sqrt {\sum\limits_{i = 1}^{n} {\left( {\mu_{A} (x_{i} ) - \mu_{B}(x_{i} )} \right)^{2} } } + \sqrt {\sum\limits_{i = 1}^{n} {\left({\mu_{B} (x_{i} ) - \mu_{C} (x_{i} )} \right)^{2} } } } \right)^{2}\\ & \qquad {\kern 1pt} + \left( {\sqrt {\sum\limits_{i = 1}^{n}{\left( {v_{A} (x_{i} ) - v_{B} (x_{i} )} \right)^{2} } } + \sqrt{\sum\limits_{i = 1}^{n} {\left( {v_{B} (x_{i} ) - v_{C} (x_{i} )}\right)^{2} } } } \right)^{2} \\ & \quad \le \left( {\sqrt a + \sqrt b } \right)^{2} + \left( {\sqrt g + \sqrt h } \right)^{2} \\ & \quad= a + b + g + h + 2\sqrt {ab} + 2\sqrt {gh} . \\ \end{aligned}$$
$$\begin{aligned} & 2n\left( {D_{o} (A,B) + D_{o} (B,C)} \right)^{2} \\ & \quad = \left( \begin{array}{l} \sqrt {\sum\nolimits_{i = 1}^{n} {\left( {\left( {\mu_{A} (x_{i} ) - \mu_{B} (x_{i} )} \right)^{2} + \left( {v_{A} (x_{i} ) - v_{B} (x_{i} )} \right)^{2} } \right)} } \hfill \\ + \sqrt {\sum\nolimits_{i = 1}^{n} {\left( {\left( {\mu_{B} (x_{i} ) - \mu_{C} (x_{i} )} \right)^{2} + \left( {v_{B} (x_{i} ) - v_{C} (x_{i} )} \right)^{2} } \right)} } \hfill \\ \end{array} \right)^{2} \\ & \quad = \left( {\sqrt {a + g} + \sqrt {b + h} } \right)^{2} \\ & \quad = a + b + g + h + 2\sqrt {\left( {a + g} \right)\left( {b + h} \right)} . \\ \end{aligned}$$

Considering such inequality:

$$\begin{aligned} & \left( {a + g} \right)\left( {b + h} \right) - \left( {\sqrt {ab} + \sqrt {gh} } \right)^{2} \\ & \quad = ah + bg - 2\sqrt {abgh} \\ & \quad = \left( {\sqrt {ah} - \sqrt {bg} } \right)^{2} \ge 0, \\ \end{aligned}$$

we get:

$$\begin{aligned} & \sqrt {\left( {a + g} \right)\left( {b + h} \right)} \ge \sqrt {ab} + \sqrt {gh} , \\ & \quad a + b + g + h + 2\sqrt {\left( {a + g} \right)\left( {b + h} \right)} \\ & \quad \ge a + b + g + h + 2\sqrt {ab} + 2\sqrt {gh} . \\ \end{aligned}$$

And then

$$2n\left( {D_{o} (A,C)} \right)^{2} \le 2n\left( {D_{o} (A,B) + D_{o} (B,C)} \right)^{2}.$$

Finally we obtain \(D_{o} (A,C) \le D_{o} (A,B) + D_{o} (B,C)\). So \(D_{o} (A,B)\) is a metric distance measure between IF sets.\(\square\)

According to Theorem 1, \(S_{o} = 1 - D_{o} (A,B)\) is a similarity measure between IF sets. This similarity was proposed by Li et al. [16]. However, it has the drawback that it is not sensitive to changes of the IF sets. Consider an example where \(A = \left\{ {\left\langle {x,0.3,0.2} \right\rangle } \right\}\), \(B = \left\{ {\left\langle {x,0.4,0.3} \right\rangle } \right\}\), \(C = \left\{ {\left\langle {x,0.4,0.1} \right\rangle } \right\}\), \(D = \left\{ {\left\langle {x,0.2,0.1} \right\rangle } \right\}\), \(E = \left\{ {\left\langle {x,0.2,0.3} \right\rangle } \right\}\). We find \(S_{o} (A,B) = S_{o} (A,C) = S_{o} (A,D) = S_{o} (A,E)\). So \(S_{o}\) is not capable of discriminating the differences between these IF sets.
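This counter-intuitive case is easy to reproduce; a Python sketch of \(S_o = 1 - D_o\) evaluated at the five IF values above:

```python
import math

def s_o(A, B):
    """S_o = 1 - D_o, with D_o the Euclidean-type distance of Eq. (6)."""
    n = len(A)
    s = sum((ma - mb) ** 2 + (va - vb) ** 2
            for (ma, va), (mb, vb) in zip(A, B))
    return 1.0 - math.sqrt(s / (2 * n))

A = [(0.3, 0.2)]
others = {"B": [(0.4, 0.3)], "C": [(0.4, 0.1)], "D": [(0.2, 0.1)], "E": [(0.2, 0.3)]}
vals = {name: s_o(A, X) for name, X in others.items()}
print(vals)  # all four values coincide (~0.9): S_o cannot tell B, C, D, E apart
```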

Besides, there is another interesting similarity measure between IF sets, the cosine similarity measure, defined by Ye [8]. He proved that \(C_{IFS} (A,B)\) satisfies properties (S1) and (S3) in Definition 5, but only illustrated that \(C_{IFS} (A,B) = 1\) if A = B. However, for the two different IF sets \(A = \left\{ {\left\langle {x,0.3,0.3} \right\rangle } \right\}\) and \(B = \left\{ {\left\langle {x,0.4,0.4} \right\rangle } \right\}\), we get \(C_{IFS} (A,B) = 1\). That is, condition (S2) in Definition 5 is not satisfied. So \(C_{IFS} (A,B)\) is not a genuine similarity measure.
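The failure of (S2) can likewise be reproduced; a Python sketch of the cosine measure, following the per-element averaged form of \(C_{IFS}\) recalled at the end of this section (and assuming no element has \(\mu = v = 0\), where the cosine is undefined):

```python
import math

def c_ifs(A, B):
    """Cosine similarity measure for IF sets (Ye [8]): the cosine between the
    (mu, v) vectors of each element, averaged over the universe."""
    n = len(A)
    return sum((ma * mb + va * vb) / (math.hypot(ma, va) * math.hypot(mb, vb))
               for (ma, va), (mb, vb) in zip(A, B)) / n

# Two different IF sets whose (mu, v) vectors are parallel:
print(c_ifs([(0.3, 0.3)], [(0.4, 0.4)]))  # ~1.0 although A != B, so (S2) fails
```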

Examining \(C_{IFS} (A,B)\) and \(D_{o} (A,B)\) closely, we see that the cosine similarity measure captures the angle, which quantifies how orthogonal two IF sets are, while the distance between IF sets quantifies how far apart they are. Therefore, we can combine the cosine similarity measure and the distance measure to define a new similarity measure for IF sets.

Definition 8

Let A and B be two IF sets in \(X = \{ x_{1} ,x_{2} , \ldots ,x_{n} \}\). A new similarity measure between them can be defined as:

$$S_{F} (A,B) = \frac{1}{2}\left( {C_{IFS} (A,B) + 1 - D_{o} (A,B)} \right)$$
(7)
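A direct Python sketch of Eq. (7) (assuming, as with the cosine measure, that no element has \(\mu = v = 0\)):

```python
import math

def s_f(A, B):
    """The proposed measure S_F = (C_IFS + 1 - D_o) / 2 of Eq. (7);
    A and B are lists of (mu, v) pairs over the same universe."""
    n = len(A)
    cos_total, sq_total = 0.0, 0.0
    for (ma, va), (mb, vb) in zip(A, B):
        cos_total += (ma * mb + va * vb) / (math.hypot(ma, va) * math.hypot(mb, vb))
        sq_total += (ma - mb) ** 2 + (va - vb) ** 2
    return 0.5 * (cos_total / n + 1.0 - math.sqrt(sq_total / (2 * n)))

# The parallel pair that fooled C_IFS is now separated from the identical pair:
print(s_f([(0.3, 0.3)], [(0.3, 0.3)]))  # ~1.0
print(s_f([(0.3, 0.3)], [(0.4, 0.4)]))  # ~0.95 < 1, since D_o > 0
```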

Theorem 5

The measure \(S_{F} (A,B)\) is a similarity measure between IF sets A and B.

Proof

(1) \(C_{IFS} (A,B)\) can be taken as the cosine value between vectors with nonnegative components, so \(0 \le C_{IFS} (A,B) \le 1\). According to Theorems 1 and 4, it is evident that \(0 \le 1 - D_{o} (A,B) \le 1\). Then we can get: \(0 \le S_{F} (A,B) \le 1\).

(2) \(S_{F} (A,A) = 1\) is straightforward. Since \(0 \le C_{IFS} (A,B) \le 1\) and \(0 \le 1 - D_{o} (A,B) \le 1\), both \(C_{IFS} (A,B) = 1\) and \(1 - D_{o} (A,B) = 1\) must hold when \(S_{F} (A,B) = 1\).

\(C_{IFS} (A,B) = 1\) indicates that \(\mu_{A} (x) = k \cdot v_{A} (x)\) and \(\mu_{B} (x) = k \cdot v_{B} (x)\) with \(k \in [0, + \infty )\) for all x ∊ X. Considering \(D_{o} (A,B) = 0 \Rightarrow A = B\), shown in the proof of Theorem 4, we get \(S_{F} (A,B) = 1 \Rightarrow A = B\).

Finally, we get \(S_{F} (A,B) = 1 \Leftrightarrow A = B\).

(3) \(S_{F} (A,B) = S_{F} (B,A)\) can be obtained straightforwardly.

(4) For three IF sets A, B and C satisfying \(A \subseteq B \subseteq C\), we have \(\mu_{A} (x) \le \mu_{B} (x) \le \mu_{C} (x)\) and \(v_{A} (x) \ge v_{B} (x) \ge v_{C} (x)\) for all x ∊ X.

A function \(f(y,z)\) can be defined as:

$$f(y,z) = \frac{ay + bz}{{\sqrt {a^{2} + b^{2} } \sqrt {y^{2} + z^{2} } }}$$

We can calculate its derivatives as:

$$\begin{aligned} \frac{\partial f}{\partial y} & = \frac{{a\sqrt {y^{2} + z^{2} } - \left( {ay + bz} \right)\frac{y}{{\sqrt {y^{2} + z^{2} } }}}}{{\sqrt {a^{2} + b^{2} } \left( {y^{2} + z^{2} } \right)}} \\ & = \frac{{az^{2} - byz}}{{\sqrt {a^{2} + b^{2} } \sqrt {y^{2} + z^{2} } \left( {y^{2} + z^{2} } \right)}} \\ & = \frac{{z\left( {az - by} \right)}}{{\sqrt {a^{2} + b^{2} } \sqrt {y^{2} + z^{2} } \left( {y^{2} + z^{2} } \right)}} \\ \end{aligned}$$
$$\begin{aligned} \frac{\partial f}{\partial z} & = \frac{{b\sqrt {y^{2} + z^{2} } - \left( {ay + bz} \right)\frac{z}{{\sqrt {y^{2} + z^{2} } }}}}{{\sqrt {a^{2} + b^{2} } \left( {y^{2} + z^{2} } \right)}} \\ & = \frac{{by^{2} - ayz}}{{\sqrt {a^{2} + b^{2} } \sqrt {y^{2} + z^{2} } \left( {y^{2} + z^{2} } \right)}} \\ & = \frac{{ - y\left( {az - by} \right)}}{{\sqrt {a^{2} + b^{2} } \sqrt {y^{2} + z^{2} } \left( {y^{2} + z^{2} } \right)}} \\ \end{aligned}$$

If \(y \ge a,z \le b\), we have \(\frac{\partial f}{\partial y} \le 0\), \(\frac{\partial f}{\partial z} \ge 0\). So for \(a = \mu_{A} (x) \le \mu_{B} (x) \le \mu_{C} (x)\) and \(b = v_{A} (x) \ge v_{B} (x) \ge v_{C} (x)\), we have \(f(\mu_{C} (x),v_{C} (x)) \le f(\mu_{B} (x),v_{B} (x))\), which can be written as:

$$\begin{aligned} & \frac{{\mu_{A} (x)\mu_{C} (x) + v_{A} (x)v_{C} (x)}}{{\sqrt {\left( {\mu_{A} (x)} \right)^{2} + \left( {v_{A} (x)} \right)^{2} } \sqrt {\left( {\mu_{C} (x)} \right)^{2} + \left( {v_{C} (x)} \right)^{2} } }} \\ & \quad \le \frac{{\mu_{A} (x)\mu_{B} (x) + v_{A} (x)v_{B} (x)}}{{\sqrt {\left( {\mu_{A} (x)} \right)^{2} + \left( {v_{A} (x)} \right)^{2} } \sqrt {\left( {\mu_{B} (x)} \right)^{2} + \left( {v_{B} (x)} \right)^{2} } }} \\ \end{aligned}$$
(8)

If \(y \le a,z \ge b\), we have \(\frac{\partial f}{\partial y} \ge 0\), \(\frac{\partial f}{\partial z} \le 0\). So for \(a = \mu_{C} (x) \ge \mu_{B} (x) \ge \mu_{A} (x)\) and \(b = v_{C} (x) \le v_{B} (x) \le v_{A} (x)\), we have \(f(\mu_{A} (x),v_{A} (x)) \le f(\mu_{B} (x),v_{B} (x))\), which can be written as:

$$\begin{aligned} & \frac{{\mu_{A} (x)\mu_{C} (x) + v_{A} (x)v_{C} (x)}}{{\sqrt {\left( {\mu_{A} (x)} \right)^{2} + \left( {v_{A} (x)} \right)^{2} } \sqrt {\left( {\mu_{C} (x)} \right)^{2} + \left( {v_{C} (x)} \right)^{2} } }} \\ & \quad \le \frac{{\mu_{C} (x)\mu_{B} (x) + v_{C} (x)v_{B} (x)}}{{\sqrt {\left( {\mu_{C} (x)} \right)^{2} + \left( {v_{C} (x)} \right)^{2} } \sqrt {\left( {\mu_{B} (x)} \right)^{2} + \left( {v_{B} (x)} \right)^{2} } }} \\ \end{aligned}$$
(9)

Since \(\mu_{A} (x) \le \mu_{B} (x) \le \mu_{C} (x)\) and \(v_{A} (x) \ge v_{B} (x) \ge v_{C} (x)\) for all \(x \in X\), (8) and (9) hold for all \(x \in X\). So we get: \(C_{IFS} (A,C) \le C_{IFS} (A,B)\), \(C_{IFS} (A,C) \le C_{IFS} (B,C)\).

Moreover, \(D_{o} (A,B) \le D_{o} (A,C)\) and \(D_{o} (B,C) \le D_{o} (A,C)\) have been proved in the proof of Theorem 4. So we have \(1 - D_{o} (A,C) \le 1 - D_{o} (A,B)\) and \(1 - D_{o} (A,C) \le 1 - D_{o} (B,C)\). We finally get: \(S_{F} (A,C) \le S_{F} (A,B)\) and \(S_{F} (A,C) \le S_{F} (B,C)\).

Thus, \(S_{F} (A,B)\) satisfies all the properties in Definition 5, and it is a similarity measure between IF sets A and B. \(\square\)
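The containment monotonicity (S4) just proved can also be observed numerically; a Python sketch reusing Eq. (7):

```python
import math

def s_f(A, B):
    """S_F of Eq. (7) for IF sets given as lists of (mu, v) pairs."""
    n = len(A)
    cos_total, sq_total = 0.0, 0.0
    for (ma, va), (mb, vb) in zip(A, B):
        cos_total += (ma * mb + va * vb) / (math.hypot(ma, va) * math.hypot(mb, vb))
        sq_total += (ma - mb) ** 2 + (va - vb) ** 2
    return 0.5 * (cos_total / n + 1.0 - math.sqrt(sq_total / (2 * n)))

# A is contained in B, B in C: memberships increase, non-memberships decrease.
A, B, C = [(0.2, 0.6)], [(0.4, 0.4)], [(0.7, 0.1)]
print(s_f(A, C) <= s_f(A, B), s_f(A, C) <= s_f(B, C))  # True True
```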

Theorem 6

Let \(A_{1} ,A_{2} , \ldots ,A_{N}\) denote N (N ≥ 2) IF sets in the universe of discourse X. The similarity matrix \(\varvec{S}_{F}\) generated by \(S_{F}\) according to Eq. (5) is a positive definite matrix.

Proof

The similarity matrix S F can be decomposed as:

$$\varvec{S}_{F} = \frac{1}{2}\varvec{S}_{o} + \frac{1}{2}\varvec{S}_{C}$$

where

$$\begin{aligned} \varvec{S}_{o} & = \left[ {\begin{array}{*{20}c} {S_{o} (A_{1} ,A_{1} )} & {S_{o} (A_{1} ,A_{2} )} & \cdots & {S_{o} (A_{1} ,A_{N} )} \\ {S_{o} (A_{2} ,A_{1} )} & {S_{o} (A_{2} ,A_{2} )} & \cdots & {S_{o} (A_{2} ,A_{N} )} \\ \vdots & \vdots & \ddots & \vdots \\ {S_{o} (A_{N} ,A_{1} )} & {S_{o} (A_{N} ,A_{2} )} & \cdots & {S_{o} (A_{N} ,A_{N} )} \\ \end{array} } \right] \\ & = \left[ {\begin{array}{*{20}c} {1 - D_{o} (A_{1} ,A_{1} )} & {1 - D_{o} (A_{1} ,A_{2} )} & \cdots & {1 - D_{o} (A_{1} ,A_{N} )} \\ {1 - D_{o} (A_{2} ,A_{1} )} & {1 - D_{o} (A_{2} ,A_{2} )} & \cdots & {1 - D_{o} (A_{2} ,A_{N} )} \\ \vdots & \vdots & \ddots & \vdots \\ {1 - D_{o} (A_{N} ,A_{1} )} & {1 - D_{o} (A_{N} ,A_{2} )} & \cdots & {1 - D_{o} (A_{N} ,A_{N} )} \\ \end{array} } \right] \\ \end{aligned}$$
$$\varvec{S}_{C} = \left[ {\begin{array}{*{20}c} {C_{IFS} (A_{1} ,A_{1} )} & {C_{IFS} (A_{1} ,A_{2} )} & \cdots & {C_{IFS} (A_{1} ,A_{N} )} \\ {C_{IFS} (A_{2} ,A_{1} )} & {C_{IFS} (A_{2} ,A_{2} )} & \cdots & {C_{IFS} (A_{2} ,A_{N} )} \\ \vdots & \vdots & \ddots & \vdots \\ {C_{IFS} (A_{N} ,A_{1} )} & {C_{IFS} (A_{N} ,A_{2} )} & \cdots & {C_{IFS} (A_{N} ,A_{N} )} \\ \end{array} } \right]$$

The cosine similarity can also be written as:

$$\begin{aligned} & C_{IFS} (A_{j} ,A_{k} ) \\ & \quad = \frac{1}{n}\sum\limits_{i = 1}^{n} {\frac{{\mu_{{A_{j} }} (x_{i} )\mu_{{A_{k} }} (x_{i} ) + v_{{A_{j} }} (x_{i} )v_{{A_{k} }} (x_{i} )}}{{\sqrt {\left( {\mu_{{A_{j} }} (x_{i} )} \right)^{2} + \left( {v_{{A_{j} }} (x_{i} )} \right)^{2} } \sqrt {\left( {\mu_{{A_{k} }} (x_{i} )} \right)^{2} + \left( {v_{{A_{k} }} (x_{i} )} \right)^{2} } }}} \\ & \quad = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left( \begin{aligned} \frac{{\mu_{{A_{j} }} (x_{i} )}}{{\sqrt {\left( {\mu_{{A_{j} }} (x_{i} )} \right)^{2} + \left( {v_{{A_{j} }} (x_{i} )} \right)^{2} } }}\frac{{\mu_{{A_{k} }} (x_{i} )}}{{\sqrt {\left( {\mu_{{A_{k} }} (x_{i} )} \right)^{2} + \left( {v_{{A_{k} }} (x_{i} )} \right)^{2} } }} \hfill \\ + \frac{{v_{{A_{j} }} (x_{i} )}}{{\sqrt {\left( {\mu_{{A_{j} }} (x_{i} )} \right)^{2} + \left( {v_{{A_{j} }} (x_{i} )} \right)^{2} } }}\frac{{v_{{A_{k} }} (x_{i} )}}{{\sqrt {\left( {\mu_{{A_{k} }} (x_{i} )} \right)^{2} + \left( {v_{{A_{k} }} (x_{i} )} \right)^{2} } }} \hfill \\ \end{aligned} \right)} \\ \end{aligned}$$

So S C can be further decomposed as:

$$\varvec{S}_{C} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\varvec{S}_{\mu } (x_{i} )} + \frac{1}{n}\sum\limits_{i = 1}^{n} {\varvec{S}_{v} (x_{i} )}$$

where for \(i = 1,2, \ldots ,n\),

$$\varvec{S}_{\mu } (x_{i} ) = \left[ {\begin{array}{*{20}c} {F_{1} (x_{i} ) \cdot F_{1} (x_{i} )} & {F_{1} (x_{i} ) \cdot F_{2} (x_{i} )} & \cdots & {F_{1} (x_{i} ) \cdot F_{N} (x_{i} )} \\ {F_{2} (x_{i} ) \cdot F_{1} (x_{i} )} & {F_{2} (x_{i} ) \cdot F_{2} (x_{i} )} & \cdots & {F_{2} (x_{i} ) \cdot F_{N} (x_{i} )} \\ \vdots & \vdots & \ddots & \vdots \\ {F_{N} (x_{i} ) \cdot F_{1} (x_{i} )} & {F_{N} (x_{i} ) \cdot F_{2} (x_{i} )} & \cdots & {F_{N} (x_{i} ) \cdot F_{N} (x_{i} )} \\ \end{array} } \right],$$
$$\varvec{S}_{v} (x_{i} ) = \left[ {\begin{array}{*{20}c} {G_{1} (x_{i} ) \cdot G_{1} (x_{i} )} & {G_{1} (x_{i} ) \cdot G_{2} (x_{i} )} & \cdots & {G_{1} (x_{i} ) \cdot G_{N} (x_{i} )} \\ {G_{2} (x_{i} ) \cdot G_{1} (x_{i} )} & {G_{2} (x_{i} ) \cdot G_{2} (x_{i} )} & \cdots & {G_{2} (x_{i} ) \cdot G_{N} (x_{i} )} \\ \vdots & \vdots & \ddots & \vdots \\ {G_{N} (x_{i} ) \cdot G_{1} (x_{i} )} & {G_{N} (x_{i} ) \cdot G_{2} (x_{i} )} & \cdots & {G_{N} (x_{i} ) \cdot G_{N} (x_{i} )} \\ \end{array} } \right],$$

with \(F_{j} (x_{i} ) = F\left( {\mu_{{A_{j} }} (x_{i} ),v_{{A_{j} }} (x_{i} )} \right) = \frac{{\mu_{{A_{j} }} (x_{i} )}}{{\sqrt {\left( {\mu_{{A_{j} }} (x_{i} )} \right)^{2} + \left( {v_{{A_{j} }} (x_{i} )} \right)^{2} } }}\), and \(G_{j} (x_{i} ) = G\left( {\mu_{{A_{j} }} (x_{i} ),v_{{A_{j} }} (x_{i} )} \right) = \frac{{v_{{A_{j} }} (x_{i} )}}{{\sqrt {\left( {\mu_{{A_{j} }} (x_{i} )} \right)^{2} + \left( {v_{{A_{j} }} (x_{i} )} \right)^{2} } }}\), for \(j = 1,2, \ldots ,N\).
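Note that \((F_{j}(x_{i}), G_{j}(x_{i}))\) are the coordinates of a unit vector, so \(F_{j}^{2} + G_{j}^{2} = 1\) for any element with a nonzero (μ, v) pair. A quick check using the hypothetical pair μ = 0.3, v = 0.4:

```python
import math

def F(mu, v):
    """Normalized membership coordinate."""
    return mu / math.hypot(mu, v)

def G(mu, v):
    """Normalized non-membership coordinate."""
    return v / math.hypot(mu, v)

# (0.3, 0.4) normalizes to (0.6, 0.8), a point on the unit circle
assert abs(F(0.3, 0.4) - 0.6) < 1e-12
assert abs(G(0.3, 0.4) - 0.8) < 1e-12
assert abs(F(0.3, 0.4) ** 2 + G(0.3, 0.4) ** 2 - 1.0) < 1e-12
```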

Let \(\varvec{\mu}_{i} = \left[ {F_{1} (x_{i} ),F_{2} (x_{i} ), \ldots ,F_{N} (x_{i} )} \right]\) and \(\varvec{v}_{i} = \left[ {G_{1} (x_{i} ),G_{2} (x_{i} ), \ldots ,G_{N} (x_{i} )} \right]\), then we can get:

$$\varvec{S}_{\mu } (x_{i} ) =\varvec{\mu}_{i}^{{\mathbf{T}}}\varvec{\mu}_{i} ,\,\quad \varvec{S}_{v} (x_{i} ) = \varvec{v}_{i}^{{\mathbf{T}}} \varvec{v}_{i} .$$

So \(\varvec{S}_{\mu } (x_{i} )\) and \(\varvec{S}_{v} (x_{i} )\) are positive semidefinite.

Then \(\varvec{S}_{C} = \frac{1}{n}\sum\nolimits_{i = 1}^{n} {\varvec{S}_{\mu } (x_{i} )} + \frac{1}{n}\sum\nolimits_{i = 1}^{n} {\varvec{S}_{v} (x_{i} )}\), as a sum of positive semidefinite matrices, is also positive semidefinite.

Since \(\varvec{S}_{o}\) is positive definite and \(\varvec{S}_{C}\) is positive semidefinite, we can conclude that \(\varvec{S}_{F} = \frac{1}{2}\varvec{S}_{o} + \frac{1}{2}\varvec{S}_{C}\) is positive definite.\(\square\)
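The outer-product decomposition used in the proof can be checked numerically. The sketch below is a minimal illustration, assuming randomly generated IF sets with nonzero (μ, v) pairs; it builds \(\varvec{S}_{C}\) from the rank-one matrices \(\varvec{\mu}_{i}^{T}\varvec{\mu}_{i}\) and \(\varvec{v}_{i}^{T}\varvec{v}_{i}\) and confirms positive semidefiniteness:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 5, 3  # five IF sets over a universe of three elements

# random memberships/non-memberships with mu + v <= 1 and both nonzero
mu = rng.uniform(0.05, 0.5, size=(N, n))
v = rng.uniform(0.05, 0.5, size=(N, n))

norm = np.sqrt(mu**2 + v**2)
F, G = mu / norm, v / norm  # unit-vector coordinates F_j(x_i), G_j(x_i)

# S_C = (1/n) * sum_i (mu_i^T mu_i + v_i^T v_i): a sum of rank-one outer products
S_C = sum(np.outer(F[:, i], F[:, i]) + np.outer(G[:, i], G[:, i])
          for i in range(n)) / n

eigvals = np.linalg.eigvalsh(S_C)
assert eigvals.min() >= -1e-12  # positive semidefinite up to rounding
```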

5 Illustrative examples

5.1 Performance of similarity measure S F

To illustrate the superiority of the proposed similarity measure, it is compared with the existing similarity measures on the widely used counter-intuitive examples. Table 2 presents the results with p = 1 for \(S_{HB} ,\,S_{e}^{p} ,\,S_{s}^{p} ,\,S_{h}^{p}\), and p = 1, t = 2 for \(S_{t}^{p}\).

Table 2 The comparison of similarity measures (counter-intuitive cases are in bold type)

From Table 2, we can see that \(S_{C} (A,B) = S_{DC} (A,B) = C_{IFS} (A,B) = 1\) for the two different IF sets \(A = \left\langle {0.3,0.3} \right\rangle\) and \(B = \left\langle {0.4,0.4} \right\rangle\). This indicates that the second axiom of similarity measures (S2) is not satisfied by \(S_{C}\), \(S_{DC}\) and \(C_{IFS}\). The same problem appears with \(S_{C} (A,B) = S_{DC} (A,B) = 1\) when \(A = \left\langle {0.5,0.5} \right\rangle\), \(B = \left\langle {0,0} \right\rangle\) and when \(A = \left\langle {0.4,0.2} \right\rangle\), \(B = \left\langle {0.5,0.3} \right\rangle\). As for S H , S O , S HB , \(S_{e}^{p}\), \(S_{s}^{p}\) and \(S_{h}^{p}\), different pairs of A, B may yield identical results, which is unsuitable for pattern recognition applications. For example, Table 2 shows that \(S_{HB} = 0.9\) both for \(A = \left\langle {0.3,0.3} \right\rangle\), \(B = \left\langle {0.4,0.4} \right\rangle\) and for \(A = \left\langle {0.3,0.4} \right\rangle\), \(B = \left\langle {0.4,0.3} \right\rangle\). The situation is even worse for \(S_{HY}^{1}\), \(S_{HY}^{2}\) and \(S_{HY}^{3}\), where all the cases take the same similarity degree except cases 3 and 4. \(S_{t}^{p}\) appears reasonable without any counter-intuitive results, but it brings a new problem: the choice of the parameters p and t, which is still an open problem. Moreover, an interesting situation arises when comparing case 3 with case 4. For the three IF sets \(A = \left\langle {1,0} \right\rangle\), \(B = \left\langle {0.5,0.5} \right\rangle\) and \(C = \left\langle {0,0} \right\rangle\), it is intuitively more reasonable to take the similarity degrees as \(S_{F} (A,C) = 0.15\) and \(S_{F} (B,C) = 0.25\) than as \(S_{t}^{p} (A,C) = 0.5\) and \(S_{t}^{p} (B,C) = 0.833\). In this sense, the proposed similarity measure is the most reasonable one: it has a relatively simple expression and none of the counter-intuitive cases.
The three IF sets \(A = \left\langle {0.4,0.2} \right\rangle\), \(B = \left\langle {0.5,0.3} \right\rangle\) and \(C = \left\langle {0.5,0.2} \right\rangle\) can be written in the form of interval values as \(A = \left[ {0.4,0.8} \right]\), \(B = \left[ {0.5,0.7} \right]\) and \(C = \left[ {0.5,0.8} \right]\), respectively. In this sense, the similarity degree between A and C should be greater than the similarity degree between A and B.

Our proposed similarity measure agrees with this analysis; it is the most reasonable similarity measure, without any counter-intuitive cases. This is because it combines the cosine similarity C IFS with the distance-induced similarity \((1 - D_{o} )\). If we regard the IF sets \(A = \left\langle {\mu_{A} (x),v_{A} (x)} \right\rangle\) and \(B = \left\langle {\mu_{B} (x),v_{B} (x)} \right\rangle\) as two vectors, \(C_{IFS} (A,B)\) quantifies the angle between them, i.e., how close to orthogonal the two IF sets are, while the distance \(D_{o} (A,B)\) quantifies how far apart they are. In general, the relationship between two vectors is determined by both angle and distance, so the combination of \(C_{IFS} (A,B)\) and \((1 - D_{o} (A,B))\) yields a similarity measure for IF sets without any counter-intuitive cases. Moreover, no additional parameters need to be determined. This analysis applies equally to IF sets on an arbitrary universe \(X = \{ x_{1} ,x_{2} , \ldots ,x_{n} \}\).
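The combined measure can be sketched in code. The implementation below is a reconstruction consistent with the values reported in Table 2, assuming \(D_{o}\) is the normalized Euclidean distance \(D_{o}(A,B) = \sqrt{\frac{1}{2n}\sum_{i}\left[(\Delta\mu)^{2} + (\Delta v)^{2}\right]}\) and that the cosine term is taken as 0 when one element is the degenerate pair ⟨0, 0⟩; the exact definitions should be taken from the earlier sections of the paper:

```python
import math

def s_f(A, B):
    """S_F = (C_IFS + (1 - D_o)) / 2 for IF sets given as lists of (mu, v) pairs."""
    n = len(A)
    cos_sum, sq_sum = 0.0, 0.0
    for (ma, va), (mb, vb) in zip(A, B):
        na, nb = math.hypot(ma, va), math.hypot(mb, vb)
        if na > 0 and nb > 0:            # degenerate <0,0> element: cosine term 0 (assumption)
            cos_sum += (ma * mb + va * vb) / (na * nb)
        sq_sum += (ma - mb) ** 2 + (va - vb) ** 2
    c_ifs = cos_sum / n
    d_o = math.sqrt(sq_sum / (2 * n))
    return 0.5 * c_ifs + 0.5 * (1 - d_o)

# Table 2, case <0.5,0.5> vs <0,0>: cosine term 0, D_o = 0.5, so S_F = 0.25
assert abs(s_f([(0.5, 0.5)], [(0.0, 0.0)]) - 0.25) < 1e-9
# Table 2, case <1,0> vs <0,0>: S_F = (1 - sqrt(0.5)) / 2, reported as 0.15
assert round(s_f([(1.0, 0.0)], [(0.0, 0.0)]), 2) == 0.15
```

These checks reproduce the \(S_{F} (A,C) = 0.15\) and \(S_{F} (B,C) = 0.25\) values discussed for case 3 above.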

Besides satisfying the definitional axioms and avoiding counter-intuitive cases, discrimination capability is another important property of a similarity measure, and one that is very useful in pattern recognition applications. To study the effectiveness of the proposed similarity measure for IF sets in pattern recognition, the widely used pattern recognition problem discussed in [8, 17] is considered.

Suppose there are m patterns, which can be represented by IF sets \(A_{j} = \left\{ {\left\langle {x_{i} ,\mu_{{A_{j} }} (x_{i} ),v_{{A_{j} }} (x_{i} )} \right\rangle \left| {x_{i} \in X} \right.} \right\}\), \(A_{j} \in IFSs(X)\), \(j = 1,2, \ldots ,m\). Let the sample to be recognized be denoted as \(B = \left\{ {\left\langle {x_{i} ,\mu_{B} (x_{i} ),v_{B} (x_{i} )} \right\rangle \left| {x_{i} \in X} \right.} \right\}\). According to the recognition principle of maximum degree of similarity between IF sets, the process of assigning B to A k is described by:

$$k = \arg \mathop {\hbox{max} }\limits_{j = 1,2, \ldots ,m} \{ S(A_{j} ,B)\}$$
(10)

To illustrate the discrimination capability of our proposed similarity measure, comparisons with the measures proposed earlier by other authors will be carried out based on the example analyzed in [22].

Example 1

Assume that there are three IF sets in \(X = \{ x_{1} ,x_{2} ,x_{3} \}\) representing three patterns. The three patterns are written as follows:

$$\begin{aligned} A_{1} = \left\{ {\left\langle {x_{1} ,0.3,0.3} \right\rangle ,\left\langle {x_{2} ,0.2,0.2} \right\rangle ,\left\langle {x_{3} ,0.1,0.1} \right\rangle } \right\}, \hfill \\ A_{2} = \left\{ {\left\langle {x_{1} ,0.2,0.2} \right\rangle ,\left\langle {x_{2} ,0.2,0.2} \right\rangle ,\left\langle {x_{3} ,0.2,0.2} \right\rangle } \right\}, \hfill \\ A_{3} = \left\{ {\left\langle {x_{1} ,0.4,0.4} \right\rangle ,\left\langle {x_{2} ,0.4,0.4} \right\rangle ,\left\langle {x_{3} ,0.4,0.4} \right\rangle } \right\}. \hfill \\ \end{aligned}$$

Assume that a sample \(B = \left\{ {\left\langle {x_{1} ,0.3,0.3} \right\rangle ,\left\langle {x_{2} ,0.2,0.2} \right\rangle ,\left\langle {x_{3} ,0.1,0.1} \right\rangle } \right\}\) is to be classified.

The similarity degrees of \(S(A_{1} ,B)\), \(S(A_{2} ,B)\) and \(S(A_{3} ,B)\) calculated for all similarity measures listed in Table 1 are shown in Table 3.

Table 3 The similarity measures between the known patterns and the unknown pattern in Example 1 (patterns not discriminated are in bold type)

The proposed similarity measure S F can be calculated by Eq. 7 as:

$$S_{F} (A_{1} ,B) = 1,\,\,S_{F} (A_{2} ,B) = 0.959,\,\,S_{F} (A_{3} ,B) = 0.892.$$

It is obvious that B is equal to A 1, which indicates that sample B should be classified to A 1. However, the similarity degrees \(S(A_{1} ,B)\), \(S(A_{2} ,B)\) and \(S(A_{3} ,B)\) are all equal when S C , S H , S DC and C IFS are employed; these four similarity measures are not capable of discriminating between the three patterns. In contrast, the results of \(S_{F} (A_{i} ,B)\,(i = 1,2,3)\) lead to the correct classification. This means that the proposed similarity measure matches the discrimination capability of the majority of the existing measures.
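Example 1 can be reproduced with a short script. As before, this uses a reconstruction of \(S_{F}\) that assumes \(D_{o}\) is the normalized Euclidean distance over (μ, v), an assumption consistent with the reported values 0.959 and 0.892, and applies the maximum-similarity rule of Eq. (10):

```python
import math

def s_f(A, B):
    """Reconstructed S_F for IF sets given as lists of (mu, v) pairs."""
    n = len(A)
    cos = sum((ma * mb + va * vb) / (math.hypot(ma, va) * math.hypot(mb, vb))
              for (ma, va), (mb, vb) in zip(A, B)) / n
    d_o = math.sqrt(sum((ma - mb) ** 2 + (va - vb) ** 2
                        for (ma, va), (mb, vb) in zip(A, B)) / (2 * n))
    return 0.5 * cos + 0.5 * (1 - d_o)

A1 = [(0.3, 0.3), (0.2, 0.2), (0.1, 0.1)]
A2 = [(0.2, 0.2), (0.2, 0.2), (0.2, 0.2)]
A3 = [(0.4, 0.4), (0.4, 0.4), (0.4, 0.4)]
B  = [(0.3, 0.3), (0.2, 0.2), (0.1, 0.1)]

scores = [s_f(A, B) for A in (A1, A2, A3)]
k = max(range(3), key=lambda j: scores[j])    # Eq. (10): argmax_j S(A_j, B)
assert k == 0                                 # B is classified to A_1
assert [round(s, 3) for s in scores] == [1.0, 0.959, 0.892]
```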

5.2 Properties of similarity matrix S F

As an illustration of Theorem 6, the positive definiteness of the similarity matrix S F is verified on a number of IF sets. Without loss of generality, we investigate the similarity matrix S F among a group of five IF sets. To make the results persuasive, all IF sets are generated randomly, with 100 replications. The evolution of the smallest eigenvalue of S F is displayed in Fig. 1. All the smallest eigenvalues are strictly positive, which indicates that the similarity matrices between random IF sets are all positive definite. This result is consistent with the proof of Theorem 6, so we can confidently claim that the similarity matrix defined by our proposed similarity measure is positive definite.
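The experiment can be replicated with a short script. The sketch below works under the same assumption as before (the reconstructed \(S_{F}\) with \(D_{o}\) taken as the normalized Euclidean distance over (μ, v)); it draws 100 groups of five random IF sets and checks that the smallest eigenvalue of each \(\varvec{S}_{F}\) stays strictly positive:

```python
import math
import numpy as np

def s_f(A, B):
    """Reconstructed S_F for IF sets given as lists of (mu, v) pairs."""
    n = len(A)
    cos = sum((ma * mb + va * vb) / (math.hypot(ma, va) * math.hypot(mb, vb))
              for (ma, va), (mb, vb) in zip(A, B)) / n
    d_o = math.sqrt(sum((ma - mb) ** 2 + (va - vb) ** 2
                        for (ma, va), (mb, vb) in zip(A, B)) / (2 * n))
    return 0.5 * cos + 0.5 * (1 - d_o)

rng = np.random.default_rng(42)
smallest = []
for _ in range(100):                       # 100 replications, as in Fig. 1
    # five random IF sets over a three-element universe, with mu + v <= 1
    sets = [list(zip(rng.uniform(0.05, 0.5, 3), rng.uniform(0.05, 0.5, 3)))
            for _ in range(5)]
    S = np.array([[s_f(a, b) for b in sets] for a in sets])
    smallest.append(np.linalg.eigvalsh(S).min())

assert min(smallest) > 0                   # every similarity matrix is positive definite
```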

Fig. 1 Evolution of the smallest eigenvalue of S F with 100 replications

6 Conclusion

The definition of similarity measures between IF sets has been researched for decades. Even though researchers have defined a multitude of similarity measures to quantify the similarity degree between IF sets, many of them suffer from counter-intuitive results. After analyzing these similarity measures critically, a new similarity measure is proposed in this paper. To explore the properties of the proposed similarity measure, the similarity matrix is also defined. A comparison between our proposed similarity measure and other existing measures is carried out on the widely used counter-intuitive cases. It is illustrated that the proposed similarity measure is more reasonable, without any counter-intuitive results. It has also been proved that the similarity matrix defined by our proposed similarity measure is positive definite, which is significant for applications of the similarity matrix.

One point worth mentioning is that our proposed similarity measure is not the only one that can be used to define a positive definite similarity matrix. Besides the combination of cosine similarity and Euclidean distance, cosine similarity can also be combined with the Hamming distance to define a new similarity measure in a similar way. Much work therefore remains to be done for a better exploration and exploitation of IF set theory.