1 Introduction

The theory of fuzzy sets was proposed by Zadeh [1] in 1965 to describe the uncertainty of phenomena. Later, in 1986, Atanassov proposed the theory of intuitionistic fuzzy sets (IFSs) [2]. An IFS is characterized by both a membership function and a non-membership function. Therefore, compared with a fuzzy set, which is characterized only by a membership function, an IFS can depict the uncertainty of data in more detail.

In 2013, Yager [3, 4] introduced the Pythagorean theorem into the theory of IFSs and proposed the theory of Pythagorean fuzzy sets (PFSs). As an extension of the IFS, the PFS has a larger representation space for expressing the uncertainty of phenomena [5]. Therefore, the theory of PFSs has developed rapidly over the past decade. In particular, it has been widely applied in decision making [4, 6, 7, 8], attribute reduction [9], self-driving vehicles [10], conflict analysis [11], medical waste treatment [12], medical diagnosis [13], and so on.

Similarity measures are an important topic in fuzzy set theory and have been applied in various fields [5, 7, 14]. Since PFSs were introduced, many authors have studied similarity measures between PFSs. Peng et al. [14] proposed 12 similarity measures between PFSs derived from distance measures between PFSs. Zeng et al. [7] proposed similarity measures between PFSs that take five parameters of PFSs into account. Zhang et al. [15] proposed four similarity measures between PFSs based on the exponential function. Firozja et al. [16] proposed a method for constructing similarity measures between PFSs based on continuous triangular conorms (t-conorms, for short).

Bustince et al. [17, 18] pointed out that the associativity of triangular norms (t-norms, for short) and t-conorms is not required in many applications, such as decision making and classification problems. In 2010, Bustince et al. [17] introduced overlap functions, which are binary aggregation functions that are not necessarily associative. Subsequently, in 2012, Bustince et al. [18] introduced grouping functions. Since then, overlap and grouping functions have been successfully applied in numerous fields, such as classification [19, 20], decision making [21, 22], image processing [23], fuzzy community detection [18], and so on.

It is worth noting that the associativity of the t-conorm plays no role in the construction method of similarity measures between PFSs given in [16]. Meanwhile, overlap and grouping functions, as two novel classes of continuous binary aggregation functions, are not necessarily associative and are widely used in applications. In addition, there are application areas in which both similarity measures between PFSs and overlap and grouping functions are used, e.g., decision making [7, 21, 22] and classification [5, 19, 20]. Therefore, the purpose of this article is to introduce two novel methods to construct similarity measures between PFSs based on overlap functions and grouping functions. In this way, we can not only construct more useful similarity measures between PFSs from the mathematical point of view, but also provide more potential applications of similarity measures between PFSs in practical classification and decision making problems.

The rest of this work is organized as follows. In Sect. 2, we review some concepts used in this paper. In Sect. 3, we introduce two novel construction methods of similarity measures between PFSs based on overlap and grouping functions. In Sect. 4, we provide some numerical examples to illustrate the advantages and reasonableness of the proposed methods. In Sect. 5, we apply the proposed methods to classification and clustering problems. Sect. 6 summarizes this article.

2 Preliminaries

2.1 Fuzzy Sets, Intuitionistic Fuzzy Sets and Pythagorean Fuzzy Sets

Let U and V be two nonempty universes and I be the unit interval [0, 1]. The notation \({\text {Map}}(U,V)\) denotes the family of all mappings from U to V. For each \(A\in {\text {Map}}(U,I),\) A is called a fuzzy set on U [1].

Let \(I^{2}_{\text {I}}=\{(x_{i},y_{i})\in I\times I\,\vert \,0\le x_{i}+y_{i}\le 1\}\) and \(I^{2}_{\text {P}}=\{(x_{p},y_{p})\in I\times I\,\vert \,0\le x^{2}_{p}+y^{2}_{p}\le 1\}\) where \(I^{2}_{\text {I}}\) and \(I^{2}_{\text {P}}\) denote the sets of all intuitionistic fuzzy numbers (IFNs) [24] and Pythagorean fuzzy numbers (PFNs) [6], respectively. Since \(x^{2}\le x\) for each \(x\in I,\) the condition \(x+y\le 1\) implies \(x^{2}+y^{2}\le 1.\) Hence, the set \(I_{\text {I}}^{2}\) is a subset of the set \(I_{\text {P}}^{2}\) (see Fig. 1).

Fig. 1 The comparison between PFNs and IFNs

Let \(p_{1}=(\alpha _{1},\beta _{1})\) and \(p_{2}=(\alpha _{2},\beta _{2})\) be two PFNs. We define

$$\begin{aligned}&p_{1}^{\text {C}}=(\beta _{1},\alpha _{1});\\&p_{1}\le p_{2}\Longleftrightarrow \alpha _{1}\le \alpha _{2} \,\text {and}\,\beta _{1}\ge \beta _{2};\\&p_{1}=p_{2}\Longleftrightarrow \alpha _{1}= \alpha _{2}\,\text {and}\,\beta _{1}= \beta _{2}. \end{aligned}$$

Definition 2.1

(See [2]) Let U be a universe. For each \(A\in {\text {Map}}(U,I^{2}_{\text {I}}),\) A is called an IFS on U where \(A(u)=(\mu _{A}(u),\nu _{A}(u))\) for each \(u\in U,\) \(\mu _{A}(u)\) and \(\nu _{A}(u)\) denote membership degree and non-membership degree, respectively, of u in the IFS A. For each \(u\in U,\) the hesitancy degree of u denoted by \(\pi _{A}(u)\) is defined as \(\pi _{A}(u)=1-\mu _{A}(u)-\nu _{A}(u).\)

Definition 2.2

(See [3]) Let U be a universe. For each \(A\in {\text {Map}}(U,I^{2}_{\text {P}}),\) A is called a PFS on U where \(A(u)=(\mu _{A}(u),\nu _{A}(u))\) for each \(u\in U,\) \(\mu _{A}(u)\) and \(\nu _{A}(u)\) denote membership degree and non-membership degree, respectively, of u in the PFS A. For each \(u\in U,\) the hesitancy degree of u denoted by \(\pi _{A}(u)\) is defined as \(\pi _{A}(u)=\sqrt{1-(\mu _{A}(u))^{2}-(\nu _{A}(u))^{2}}.\)

Remark 2.1

As noted above, the set \(I_{\text {I}}^{2}\) is a subset of the set \(I_{\text {P}}^{2}.\) Therefore, by Definitions 2.1 and 2.2, every IFS is a PFS, that is, the IFS is a special case of the PFS.
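The inclusion in Remark 2.1 can be checked numerically. The following is a minimal sketch, assuming PFNs are stored as plain `(mu, nu)` pairs of floats; the function names `is_ifn`, `is_pfn` and `pfn_hesitancy` and the tolerance `EPS` are hypothetical and are introduced here only for illustration.

```python
import math

EPS = 1e-12  # assumed tolerance for floating-point round-off


def is_ifn(mu: float, nu: float) -> bool:
    """True if (mu, nu) is an IFN: mu, nu in [0, 1] and mu + nu <= 1."""
    return 0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0 + EPS


def is_pfn(mu: float, nu: float) -> bool:
    """True if (mu, nu) is a PFN: mu, nu in [0, 1] and mu^2 + nu^2 <= 1."""
    return 0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu * mu + nu * nu <= 1.0 + EPS


def pfn_hesitancy(mu: float, nu: float) -> float:
    """Hesitancy degree of a PFN as in Definition 2.2."""
    if not is_pfn(mu, nu):
        raise ValueError("(mu, nu) is not a Pythagorean fuzzy number")
    # max(...) guards against a tiny negative value caused by round-off
    return math.sqrt(max(0.0, 1.0 - mu * mu - nu * nu))


# (0.6, 0.7) is a PFN but not an IFN, since 0.6 + 0.7 > 1 and 0.36 + 0.49 <= 1.
print(is_ifn(0.6, 0.7), is_pfn(0.6, 0.7))  # False True
```

Every pair accepted by `is_ifn` is also accepted by `is_pfn`, while the converse fails, which is the content of Fig. 1.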

Let \(A,B\in {\text {Map}}(U,I^{2}_{\text {P}})\) where \(A(u)=(\mu _{A}(u),\nu _{A}(u))\) and \(B(u)=(\mu _{B}(u),\nu _{B}(u))\) are two PFNs for each \(u\in U.\) We define

$$\begin{aligned}&A^{\text {C}}(u)=(A(u))^{\text {C}};\\&A\subseteq B\Longleftrightarrow A(u)\le B(u)\,\text {for each}\,u\in U;\\&A= B\Longleftrightarrow A(u)= B(u)\,\text {for each}\,u\in U. \end{aligned}$$

2.2 Fuzzy Negations, t-Norms, t-Conorms, Overlap and Grouping Functions

Definition 2.3

(See [25]) A non-increasing mapping \(N:I\rightarrow I\) is called a fuzzy negation if \(N(0)=1\) and \(N(1)=0.\) Further, a fuzzy negation N is called strong or involutive if \(N(N(x))=x\) for each \(x\in I.\)

For each \(x\in I,\) \(N_{S}(x)=1-x,\) \(N_{1,\lambda }(x)=\frac{1-x}{1+\lambda x}\,(\lambda \in ]-1,+\infty [)\) and \(N_{2,\varphi }(x)=(1-x^{\varphi })^{\frac{1}{\varphi }}\,(\varphi \in ]0,+\infty [)\) are three types of strong fuzzy negations [26] (see Fig. 2).

Fig. 2 The curves of some strong fuzzy negations
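The three families of strong fuzzy negations listed above are easy to evaluate. The sketch below is an illustration under the assumption that arguments are floats in [0, 1]; the names `neg_standard`, `neg_lambda`, `neg_phi` and `is_involutive` are hypothetical. The helper checks the involution property \(N(N(x))=x\) of Definition 2.3 on a finite grid only, so it is a sanity check rather than a proof.

```python
from typing import Callable


def neg_standard(x: float) -> float:
    """N_S(x) = 1 - x."""
    return 1.0 - x


def neg_lambda(x: float, lam: float) -> float:
    """N_{1,lambda}(x) = (1 - x) / (1 + lambda * x), with lambda > -1."""
    if lam <= -1.0:
        raise ValueError("lambda must be greater than -1")
    return (1.0 - x) / (1.0 + lam * x)


def neg_phi(x: float, phi: float) -> float:
    """N_{2,phi}(x) = (1 - x^phi)^(1/phi), with phi > 0."""
    if phi <= 0.0:
        raise ValueError("phi must be positive")
    return (1.0 - x ** phi) ** (1.0 / phi)


def is_involutive(neg: Callable[[float], float], steps: int = 20, tol: float = 1e-9) -> bool:
    """Check N(N(x)) = x at the grid points 0, 1/steps, ..., 1 (up to tol)."""
    return all(abs(neg(neg(i / steps)) - i / steps) <= tol for i in range(steps + 1))


print(is_involutive(neg_standard))                 # True
print(is_involutive(lambda x: neg_lambda(x, 2.0)))  # True
print(is_involutive(lambda x: neg_phi(x, 2.0)))     # True
```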

Lemma 2.1

Let N be a strong fuzzy negation. Then the following statements hold.

  1. (1)

    \(N(x)=1\) if and only if \(x=0;\)

  2. (2)

    \(N(x)=0\) if and only if \(x=1.\)

Proof

  1. (1)

    Since \(N(N(x))=x\) for each \(x\in I,\) the mapping N is injective. Together with \(N(1)=0,\) it follows that

    $$N(x)=1\Leftrightarrow N(N(x))=N(1) \Leftrightarrow x=0.$$

    Therefore, one concludes that \(N(x)=1\) if and only if \(x=0.\)

  2. (2)

    It can be proved in a similar way as for statement (1).

\(\square\)

Definition 2.4

(See [25]) A mapping \(T:I\times I\rightarrow I\) (resp. \(S:I\times I\rightarrow I\)) is called a t-norm (resp. t-conorm) if it is commutative, associative and increasing and has 1 (resp. 0) as its neutral element.

Further, a t-norm T is called continuous if it is continuous in both arguments at the same time. A t-norm T is called positive if \(T(x,y)=0\) implies that either \(x=0\) or \(y=0.\)

Definition 2.5

(See [17]) A binary function \(O:I\times I\rightarrow I\) is called an overlap function if, for any \(x,y\in I,\) it satisfies the following conditions:

  1. (O1)

    O is commutative;

  2. (O2)

    \(O(x,y)=0\) if and only if \(xy=0;\)

  3. (O3)

    \(O(x,y)=1\) if and only if \(xy=1;\)

  4. (O4)

    O is increasing;

  5. (O5)

    O is continuous.

We now list some commonly used overlap functions from [27,28,29,30] as follows.

Example 2.1

  1. (1)

    Any continuous and positive t-norm is an overlap function.

  2. (2)

    For any \(p>0,\) the function \(O_{p}:I\times I\rightarrow I\) given, for each \(x,y\in I,\) by

    $$O_{p}(x,y)=x^{p}y^{p}$$

    is an overlap function.

  3. (3)

    The function \(O_{\text {DB}}:I\times I\rightarrow I\) given, for each \(x,y\in I,\) by

    $$O_{\text {DB}}(x,y)=\left\{ \begin{array}{ll} \frac{2xy}{x+y} &{}\text {if}\,x+y\ne 0, \\ 0 &{}\text {if}\,x+y=0, \end{array} \right.$$

    is an overlap function.

  4. (4)

    The function \(O_{+}:I\times I\rightarrow I\) given, for each \(x,y\in I,\) by

    $$O_{+}(x,y)=\frac{2xy}{1+xy}$$

    is an overlap function.

  5. (5)

    The function \(O_{\text {Mid}}:I\times I\rightarrow I\) given, for each \(x,y\in I,\) by

    $$O_{\text {Mid}}(x,y)=xy\frac{x+y}{2}$$

    is an overlap function.

  6. (6)

    The function \(O_{\text {mM}}:I\times I\rightarrow I\) given, for each \(x,y\in I,\) by

    $$O_{\text {mM}}(x,y)=\min \{x,y\}\max \{x^{2},y^{2}\}$$

    is an overlap function.
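The overlap functions of Example 2.1 can be written down directly. The following sketch is illustrative only: the names `make_o_p`, `o_db`, `o_plus`, `o_mid`, `o_mm` and `check_overlap_on_grid` are hypothetical, and the checker tests conditions (O1) to (O4) of Definition 2.5 on a finite grid. Continuity (O5) cannot be verified this way and is not checked.

```python
from typing import Callable

Binary = Callable[[float, float], float]


def make_o_p(p: float) -> Binary:
    """Return O_p(x, y) = x^p * y^p for a fixed p > 0."""
    if p <= 0.0:
        raise ValueError("p must be positive")
    return lambda x, y: (x ** p) * (y ** p)


def o_db(x: float, y: float) -> float:
    """O_DB(x, y) = 2xy / (x + y), and 0 when x + y = 0."""
    return 0.0 if x + y == 0.0 else 2.0 * x * y / (x + y)


def o_plus(x: float, y: float) -> float:
    """O_+(x, y) = 2xy / (1 + xy)."""
    return 2.0 * x * y / (1.0 + x * y)


def o_mid(x: float, y: float) -> float:
    """O_Mid(x, y) = xy(x + y) / 2."""
    return x * y * (x + y) / 2.0


def o_mm(x: float, y: float) -> float:
    """O_mM(x, y) = min{x, y} * max{x^2, y^2}."""
    return min(x, y) * max(x * x, y * y)


def check_overlap_on_grid(o: Binary, steps: int = 10, tol: float = 1e-12) -> bool:
    """Check (O1)-(O4) at the grid points i/steps; a sanity check, not a proof."""
    grid = [i / steps for i in range(steps + 1)]
    for x in grid:
        for y in grid:
            v = o(x, y)
            if abs(v - o(y, x)) > tol:                       # (O1) commutativity
                return False
            if (v == 0.0) != (x == 0.0 or y == 0.0):         # (O2) O = 0 iff xy = 0
                return False
            if (v == 1.0) != (x == 1.0 and y == 1.0):        # (O3) O = 1 iff xy = 1
                return False
            if any(o(z, y) > v + tol for z in grid if z <= x):  # (O4) increasing
                return False
    return True


for name, o in [("O_2", make_o_p(2.0)), ("O_DB", o_db), ("O_+", o_plus),
                ("O_Mid", o_mid), ("O_mM", o_mm)]:
    print(name, check_overlap_on_grid(o))
```

As a negative control, `max` fails the check because \(\max \{0,0.5\}\ne 0\) although \(0\cdot 0.5=0,\) which violates (O2).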

Definition 2.6

(See [18]) A binary function \(G:I\times I\rightarrow I\) is called a grouping function if, for any \(x,y\in I,\) it satisfies the following conditions:

  1. (G1)

    G is commutative;

  2. (G2)

    \(G(x,y)=0\) if and only if \(x=y=0;\)

  3. (G3)

    \(G(x,y)=1\) if and only if \(x=1\) or \(y=1;\)

  4. (G4)

    G is increasing;

  5. (G5)

    G is continuous.

Lemma 2.2

(See [18]) Let \(N:I\rightarrow I\) be a strong fuzzy negation. Then the following statements are equivalent.

  1. (1)

    A binary function \(G:I\times I\rightarrow I\) is a grouping function.

  2. (2)

    There exists an overlap function O such that, for each \(x,y\in I,\)

    $$G(x,y)=N(O(N(x),N(y))).$$

The grouping function G (resp. overlap function O) given by Lemma 2.2 is called the N-dual grouping function of O (resp. N-dual overlap function of G).
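The construction in Lemma 2.2 is a one-line composition. The sketch below assumes that overlap functions and negations are given as ordinary Python callables; the name `n_dual_grouping` is hypothetical. As an illustration, it takes the product \(O_{1}(x,y)=xy\) and the negation \(N_{S},\) for which the dual is \(G(x,y)=x+y-xy.\)

```python
from typing import Callable

Unary = Callable[[float], float]
Binary = Callable[[float, float], float]


def n_dual_grouping(overlap: Binary, neg: Unary) -> Binary:
    """Return G(x, y) = N(O(N(x), N(y))) as in Lemma 2.2."""
    def grouping(x: float, y: float) -> float:
        return neg(overlap(neg(x), neg(y)))
    return grouping


def product(x: float, y: float) -> float:
    """O_1(x, y) = xy (illustrative choice of overlap function)."""
    return x * y


def standard(x: float) -> float:
    """N_S(x) = 1 - x."""
    return 1.0 - x


g = n_dual_grouping(product, standard)
print(g(0.0, 0.0), g(1.0, 0.3), g(0.5, 0.5))  # 0.0 1.0 0.75
```

The printed values agree with conditions (G2) and (G3) of Definition 2.6: the dual vanishes only at \((0,0)\) and equals 1 as soon as one argument equals 1.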

Lemma 2.3

(See [18]) Let G be a grouping function. If G is associative, then G becomes a t-conorm.

3 Two Novel Construction Methods of Similarity Measure Between PFSs

To begin with, we review the concept of similarity measure between PFSs.

Definition 3.1

(See [7]) A mapping \({\mathbb {S}}:{\text {Map}}(U,I^{2}_{\text {P}})\times {\text {Map}}(U,I^{2}_{\text {P}})\rightarrow I\) is called a similarity measure between PFSs if, for each \(A,B,C\in {\text {Map}}(U,I^{2}_{\text {P}}),\) it satisfies the following conditions:

  1. (SP1)

    \(0\le {\mathbb {S}}(A,B)\le 1;\)

  2. (SP2)

    \({\mathbb {S}}(A,B)={\mathbb {S}}(B,A);\)

  3. (SP3)

    \({\mathbb {S}}(A,B)=1\) if and only if \(A=B;\)

  4. (SP4)

    If \(A\subseteq B\subseteq C,\) then \({\mathbb {S}}(A,C)\le \min \{{\mathbb {S}}(A,B),{\mathbb {S}}(B,C)\}.\)

3.1 Similarity Measure Derived from Overlap Functions

In this subsection, we propose a novel method to construct similarity measure between PFSs based on overlap functions. To begin with, we propose the similarity measure between PFNs based on overlap functions as follows.

Definition 3.2

Let \(k\in [1,+\infty [,\) \(N_{a}\) and \(N_{b}\) be two strong fuzzy negations and O be an overlap function. The mapping \({\mathbb {S}}_{\text {P}}^{O}:I^{2}_{\text {P}}\times I^{2}_{\text {P}}\rightarrow I\) is defined as

$${\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})=\root k \of {O(N_{a}(\vert \alpha _{1}-\alpha _{2}\vert ^{k}),N_{b}(\vert \beta _{1}-\beta _{2}\vert ^{k}))}\tag{1}$$

where \(p_{1}=(\alpha _{1},\beta _{1}),p_{2}=(\alpha _{2},\beta _{2})\in I^{2}_{\text {P}}.\)
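Equation (1) translates almost literally into code. The following is a minimal sketch, assuming PFNs are `(alpha, beta)` tuples and that the overlap function and the two negations are passed as callables; the name `o_similarity_pfn` and the concrete choices in the usage lines (product overlap, \(N_{a}=N_{b}=N_{S},\) \(k=2\)) are illustrative assumptions.

```python
from typing import Callable, Tuple

PFN = Tuple[float, float]
Unary = Callable[[float], float]
Binary = Callable[[float, float], float]


def o_similarity_pfn(p1: PFN, p2: PFN, overlap: Binary,
                     neg_a: Unary, neg_b: Unary, k: float = 1.0) -> float:
    """Evaluate Eq. (1): the k-th root of O(N_a(|da|^k), N_b(|db|^k))."""
    if k < 1.0:
        raise ValueError("k must lie in [1, +inf[")
    x = neg_a(abs(p1[0] - p2[0]) ** k)  # negated membership difference
    y = neg_b(abs(p1[1] - p2[1]) ** k)  # negated non-membership difference
    return overlap(x, y) ** (1.0 / k)


def product(x: float, y: float) -> float:
    return x * y


def standard(x: float) -> float:
    return 1.0 - x


p1, p2 = (0.6, 0.3), (0.4, 0.5)
s = o_similarity_pfn(p1, p2, product, standard, standard, k=2.0)
print(round(s, 4))  # 0.96
```

In this instance both differences equal 0.2, so the value is \(\sqrt{(1-0.04)(1-0.04)}=0.96.\)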

Theorem 3.1

Let \({\mathbb {S}}_{\text {P}}^{O}\) be the mapping given by Eq. (1). Then, for each \(p_{1}=(\alpha _{1},\beta _{1}),p_{2}=(\alpha _{2},\beta _{2}), p_{3}=(\alpha _{3},\beta _{3})\in I^{2}_{\text {P}},\) the following statements hold.

  1. (1)

    \(0\le {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})\le 1;\)

  2. (2)

    \({\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})={\mathbb {S}}_{\text {P}}^{O}(p_{2},p_{1});\)

  3. (3)

    \({\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})=1\) if and only if \(p_{1}=p_{2};\)

  4. (4)

    If \(p_{1}\le p_{2}\le p_{3},\) then \({\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{3})\le \min \left\{ {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2}),{\mathbb {S}}_{\text {P}}^{O}(p_{2},p_{3})\right\} .\)

Proof

  1. (1)

    Since \(\alpha _{1},\alpha _{2},\beta _{1},\beta _{2}\in I,\) one has that \(\vert \alpha _{1}-\alpha _{2}\vert ,\vert \beta _{1}-\beta _{2}\vert \in I.\) Further, the unit interval I is closed under the power operation \(x\mapsto x^{t}\) for each \(t\in ]0,+\infty [,\) in particular for \(t=k\) and \(t=\frac{1}{k}.\) In addition, by Definitions 2.3 and 2.5, the strong fuzzy negations \(N_{a}\) and \(N_{b}\) are unary operations on I and the overlap function O is a binary operation on I. Hence, one obtains that \(0\le {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})\le 1.\)

  2. (2)

    From Eq. (1), it follows that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})=&\root k \of {O(N_{a}(\vert \alpha _{1}-\alpha _{2}\vert ^{k}),N_{b}(\vert \beta _{1}-\beta _{2}\vert ^{k}))}\\ =&\root k \of {O(N_{a}(\vert \alpha _{2}-\alpha _{1}\vert ^{k}),N_{b}(\vert \beta _{2}-\beta _{1}\vert ^{k}))}\\ =&{\mathbb {S}}_{\text {P}}^{O}(p_{2},p_{1}). \end{aligned}$$

    Therefore, one concludes that \({\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})={\mathbb {S}}_{\text {P}}^{O}(p_{2},p_{1}).\)

  3. (3)

    From Eq. (1), it follows that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})=1 \Longleftrightarrow&\root k \of {O(N_{a}(\vert \alpha _{1}-\alpha _{2}\vert ^{k}),N_{b}(\vert \beta _{1}-\beta _{2}\vert ^{k}))}=1\\ \Longleftrightarrow&O(N_{a}(\vert \alpha _{1}-\alpha _{2}\vert ^{k}),N_{b}(\vert \beta _{1}-\beta _{2}\vert ^{k}))=1. \end{aligned}$$

    Then, by condition (O3) of Definition 2.5, one has that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})=1\Longleftrightarrow&N_{a}(\vert \alpha _{1}-\alpha _{2}\vert ^{k})=N_{b}(\vert \beta _{1}-\beta _{2}\vert ^{k})=1. \end{aligned}$$

    Further, by statement (1) of Lemma 2.1, each strong fuzzy negation N satisfies \(N(x)=1\) if and only if \(x=0.\) Thus, since \(N_{a}\) and \(N_{b}\) are two strong fuzzy negations, it follows that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})=1 \Longleftrightarrow&\vert \alpha _{1}-\alpha _{2}\vert ^{k}=\vert \beta _{1}-\beta _{2}\vert ^{k}=0\\ \Longleftrightarrow&p_{1}=p_{2}. \end{aligned}$$

    Therefore, one concludes that \({\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})=1\) if and only if \(p_{1}=p_{2}.\)

  4. (4)

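    Let \(p_{1}=(\alpha _{1},\beta _{1}),p_{2}=(\alpha _{2},\beta _{2}), p_{3}=(\alpha _{3},\beta _{3})\in I^{2}_{\text {P}}\) with \(p_{1}\le p_{2}\le p_{3},\) that is, \(\alpha _{1}\le \alpha _{2}\le \alpha _{3}\) and \(\beta _{1}\ge \beta _{2}\ge \beta _{3}.\) Then, since \(x\mapsto x^{k}\) is increasing on I for \(k\in [1,+\infty [,\) it follows that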

    $$\vert \alpha _{1}-\alpha _{3}\vert ^{k}\ge \max \left\{ \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \alpha _{2}-\alpha _{3}\vert ^{k}\right\}$$

    and

    $$\vert \beta _{1}-\beta _{3}\vert ^{k}\ge \max \left\{ \vert \beta _{1}-\beta _{2}\vert ^{k},\vert \beta _{2}-\beta _{3}\vert ^{k}\right\} .$$

    Further, by Definition 2.3 and condition (O4) of Definition 2.5, since O is increasing and \(N_{a},N_{b}\) are non-increasing, it follows that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{3})=&\root k \of {O(N_{a}(\vert \alpha _{1}-\alpha _{3}\vert ^{k}),N_{b}(\vert \beta _{1}-\beta _{3}\vert ^{k}))}\\ \le&\root k \of {O(N_{a}(\vert \alpha _{1}-\alpha _{2}\vert ^{k}),N_{b}(\vert \beta _{1}-\beta _{2}\vert ^{k}))}\\ =&{\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2}) \end{aligned}$$

    and

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{3})=&\root k \of {O(N_{a}(\vert \alpha _{1}-\alpha _{3}\vert ^{k}),N_{b}(\vert \beta _{1}-\beta _{3}\vert ^{k}))}\\ \le&\root k \of {O(N_{a}(\vert \alpha _{2}-\alpha _{3}\vert ^{k}),N_{b}(\vert \beta _{2}-\beta _{3}\vert ^{k}))}\\ =&{\mathbb {S}}_{\text {P}}^{O}(p_{2},p_{3}). \end{aligned}$$

    Therefore, one concludes that \({\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{3})\le \min \left\{ {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2}),{\mathbb {S}}_{\text {P}}^{O}(p_{2},p_{3})\right\} .\)

\(\square\)

For convenience, the mapping \({\mathbb {S}}_{\text {P}}^{O}\) defined in Definition 3.2 is called the O-similarity measure between PFNs.

Definition 3.3

Let \(U=\{u_{1},u_{2},\ldots ,u_{n}\}\) be a nonempty finite universe, \({\mathbb {S}}_{\text {P}}^{O}\) be the O-similarity measure between PFNs given by Eq. (1) and \(\omega =(\omega _{1},\omega _{2},\ldots ,\omega _{n})\in ]0,1]^{n}\) with \(\sum _{i=1}^{n}\omega _{i}=1.\) The mapping \({\mathbb {S}}^{O}:{\text {Map}}(U,I^{2}_{\text {P}})\times {\text {Map}}(U,I^{2}_{\text {P}})\rightarrow I\) is defined as

$${\mathbb {S}}^{O}(A,B)=\sum _{i=1}^{n}\omega _{i}{\mathbb {S}}_{\text {P}}^{O}(A(u_{i}),B(u_{i}))\tag{2}$$

where \(A,B\in {\text {Map}}(U,I^{2}_{\text {P}}).\)

In the following, \(\omega =(\omega _{1},\omega _{2},\ldots ,\omega _{n})\in ]0,1]^{n}\) is called the weight vector of the universe.
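The weighted aggregation in Eq. (2) can be sketched as follows. The code assumes a PFS on \(U=\{u_{1},\ldots ,u_{n}\}\) is stored as a list of `(mu, nu)` pairs in the order of the universe. The names `pfn_similarity` and `o_similarity_pfs` are hypothetical, and the PFN-level measure used here (product overlap, \(N_{a}=N_{b}=N_{S},\) \(k=1\)) is only one illustrative instance of Eq. (1).

```python
import math
from typing import Callable, Sequence, Tuple

PFN = Tuple[float, float]


def pfn_similarity(p1: PFN, p2: PFN) -> float:
    """Eq. (1) with O(x, y) = xy, N_a = N_b = 1 - x and k = 1 (illustrative)."""
    return (1.0 - abs(p1[0] - p2[0])) * (1.0 - abs(p1[1] - p2[1]))


def o_similarity_pfs(A: Sequence[PFN], B: Sequence[PFN], weights: Sequence[float],
                     sim: Callable[[PFN, PFN], float] = pfn_similarity) -> float:
    """Eq. (2): weighted sum of PFN-level similarities over the universe."""
    if not (len(A) == len(B) == len(weights)):
        raise ValueError("A, B and weights must have the same length")
    # Definition 3.3 requires every weight to lie in ]0, 1] and the sum to be 1.
    if any(w <= 0.0 or w > 1.0 for w in weights):
        raise ValueError("each weight must lie in ]0, 1]")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-12):
        raise ValueError("weights must sum to 1")
    return sum(w * sim(a, b) for w, a, b in zip(weights, A, B))


A = [(0.6, 0.3), (0.2, 0.7)]
B = [(0.4, 0.5), (0.2, 0.7)]
value = o_similarity_pfs(A, B, [0.25, 0.75])
print(round(value, 4))  # 0.91
```

Here the first element contributes \(0.25\cdot 0.8\cdot 0.8=0.16\) and the second, where A and B agree, contributes \(0.75\cdot 1,\) giving 0.91. Rejecting zero weights in the validation step mirrors the requirement \(\omega \in ]0,1]^{n}.\)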

Theorem 3.2

Let \(U=\{u_{1},u_{2},\ldots ,u_{n}\}\) be a nonempty finite universe and \({\mathbb {S}}^{O}\) be the mapping given by Eq. (2). Then, for each \(A,B,C\in {\text{Map}}(U,I^{2}_{\text {P}}),\) the following statements hold.

  1. (1)

    \(0\le {\mathbb {S}}^{O}(A,B)\le 1;\)

  2. (2)

    \({\mathbb {S}}^{O}(A,B)={\mathbb {S}}^{O}(B,A);\)

  3. (3)

    \({\mathbb {S}}^{O}(A,B)=1\) if and only if \(A=B;\)

  4. (4)

    If \(A\subseteq B \subseteq C,\) then \({\mathbb {S}}^{O}(A,C)\le \min \{{\mathbb {S}}^{O}(A,B),{\mathbb {S}}^{O}(B,C)\}.\)

Proof

Statements (1) and (2) can be immediately derived from Eqs. (1) and (2) and statements (1) and (2) of Theorem 3.1. Therefore, we only verify statements (3) and (4) as follows.

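  3. (3)

    Since \(\omega _{i}\in ]0,1]\) with \(\sum _{i=1}^{n}\omega _{i}=1\) and \({\mathbb {S}}_{\text {P}}^{O}(A(u_{i}),B(u_{i}))\le 1\) for each i, from Eqs. (1) and (2), it follows that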

    $$\begin{aligned} {\mathbb {S}}^{O}(A,B)=1 \Longleftrightarrow&\sum _{i=1}^{n}\omega _{i}{\mathbb {S}}_{\text {P}}^{O}(A(u_{i}),B(u_{i}))=1\\ \Longleftrightarrow&{\mathbb {S}}_{\text {P}}^{O}(A(u_{i}),B(u_{i}))=1\quad \text {for}\,i=1,2,\ldots ,n. \end{aligned}$$

    Further, from statement (3) of Theorem 3.1, it follows that

    $${\mathbb {S}}^{O}(A,B)=1 \Longleftrightarrow A(u_{i})=B(u_{i}),$$

    for each \(u_{i}\in U.\) Therefore, one concludes that \({\mathbb {S}}^{O}(A,B)=1\) if and only if \(A=B.\)

  4. (4)

    Let \(A,B,C\in {\text {Map}}(U,I^{2}_{\text {P}})\) with \(A\subseteq B \subseteq C.\) Then, from statement (4) of Theorem 3.1, one obtains that

    $$\begin{aligned} {\mathbb {S}}^{O}_{\text {P}}(A(u_{i}),C(u_{i})) \le&\min \left\{ {\mathbb {S}}^{O}_{\text {P}}(A(u_{i}),B(u_{i})),{\mathbb {S}}^{O}_{\text {P}}(B(u_{i}),C(u_{i}))\right\} \end{aligned}$$

    for each \(u_{i}\in U.\) Thus, multiplying both sides by \(\omega _{i}>0\) and summing over \(i=1,2,\ldots ,n,\) it follows that

    $$\sum _{i=1}^{n}\omega _{i}{\mathbb {S}}_{\text {P}}^{O}(A(u_{i}),C(u_{i})) \le \sum _{i=1}^{n}\omega _{i}{\mathbb {S}}_{\text {P}}^{O}(A(u_{i}),B(u_{i}))$$

    and

    $$\sum _{i=1}^{n}\omega _{i}{\mathbb {S}}_{\text {P}}^{O}(A(u_{i}),C(u_{i})) \le \sum _{i=1}^{n}\omega _{i}{\mathbb {S}}_{\text {P}}^{O}(B(u_{i}),C(u_{i})),$$

    respectively. Therefore, one concludes that \({\mathbb {S}}^{O}(A,C)\le \min \{{\mathbb {S}}^{O}(A,B),{\mathbb {S}}^{O}(B,C)\}.\)

\(\square\)

From Theorem 3.2, one has that the mapping \({\mathbb {S}}^{O}\) given by Definition 3.3 is a similarity measure between PFSs. The mapping \({\mathbb {S}}^{O}\) is called the O-similarity measure between PFSs.
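Condition (SP4) can also be spot-checked on random data. The sketch below is a randomized sanity check and not a substitute for the proof of Theorem 3.2. It assumes the illustrative choices \(O(x,y)=xy,\) \(N_{a}=N_{b}=N_{S},\) \(k=2\) and equal weights, and all function names are hypothetical. Chains \(A\subseteq B\subseteq C\) are generated by sorting three membership degrees in increasing order and three non-membership degrees in decreasing order.

```python
import random
from typing import List, Sequence, Tuple

PFN = Tuple[float, float]


def pfn_similarity(p1: PFN, p2: PFN, k: float = 2.0) -> float:
    """Eq. (1) with O(x, y) = xy and N_a = N_b = 1 - x (illustrative choice)."""
    x = 1.0 - abs(p1[0] - p2[0]) ** k
    y = 1.0 - abs(p1[1] - p2[1]) ** k
    return (x * y) ** (1.0 / k)


def pfs_similarity(A: Sequence[PFN], B: Sequence[PFN], weights: Sequence[float]) -> float:
    """Eq. (2): weighted sum of the PFN-level similarities."""
    return sum(w * pfn_similarity(a, b) for w, a, b in zip(weights, A, B))


def random_chain(n: int, rng: random.Random) -> Tuple[List[PFN], List[PFN], List[PFN]]:
    """Draw PFSs A, B, C on an n-element universe with A ⊆ B ⊆ C."""
    bound = 2.0 ** -0.5  # mu, nu <= 1/sqrt(2) keeps mu^2 + nu^2 within 1
    A, B, C = [], [], []
    for _ in range(n):
        mus = sorted(rng.uniform(0.0, bound) for _ in range(3))
        nus = sorted((rng.uniform(0.0, bound) for _ in range(3)), reverse=True)
        A.append((mus[0], nus[0]))
        B.append((mus[1], nus[1]))
        C.append((mus[2], nus[2]))
    return A, B, C


def check_sp4(trials: int = 1000, n: int = 4, seed: int = 0, tol: float = 1e-12) -> bool:
    """Return True if S(A, C) <= min{S(A, B), S(B, C)} held in every trial."""
    rng = random.Random(seed)
    weights = [1.0 / n] * n
    for _ in range(trials):
        A, B, C = random_chain(n, rng)
        s_ac = pfs_similarity(A, C, weights)
        if s_ac > min(pfs_similarity(A, B, weights), pfs_similarity(B, C, weights)) + tol:
            return False
    return True


print(check_sp4())  # True
```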

Remark 3.1

In Definition 3.3, the requirement \(\omega _{i}\ne 0\) for each \(i\in \{1,2,\ldots ,n\}\) is necessary. Consider the following case. Let \(U=\{u_{1},u_{2}\},\) \(\omega =(0,1),\) \({\mathbb {S}}^{O}\) be the mapping given by Eq. (2) and \(A,B\in {\text {Map}}(U,I^{2}_{\text {P}})\) where \(A=\frac{(0.5,0.5)}{u_{1}}+\frac{(0.5,0.5)}{u_{2}}\) and \(B=\frac{(1,0)}{u_{1}}+\frac{(0.5,0.5)}{u_{2}}.\) Then, by Eqs. (1) and (2), it follows that

$${\mathbb {S}}^{O}(A,B)={\mathbb {S}}^{O}_{\text {P}}(A(u_{2}),B(u_{2}))=1.$$

However, \(A\ne B.\) Therefore, in this case, \({\mathbb {S}}^{O}\) does not satisfy condition (SP3) of Definition 3.1, that is, \({\mathbb {S}}^{O}\) is not a similarity measure between PFSs.

3.2 Similarity Measure Derived from Grouping Functions

Definition 3.4

Let \(k\in [1,+\infty [,\) N be a strong fuzzy negation and G be a grouping function. The mapping \({\mathbb {S}}_{\text {P}}^{G}:I^{2}_{\text {P}}\times I^{2}_{\text {P}}\rightarrow I\) is defined as

$${\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})=\root k \of {N\left( G\left( \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \beta _{1}-\beta _{2}\vert ^{k}\right) \right) },\tag{3}$$

where \(p_{1}=(\alpha _{1},\beta _{1}),p_{2}=(\alpha _{2},\beta _{2})\in I^{2}_{\text {P}}.\)
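To see what Eq. (3) looks like for a concrete choice, consider the following worked instance. It is an illustration only: the choice \(G(x,y)=\max \{x,y\},\) which satisfies conditions (G1) to (G5) of Definition 2.6, together with \(N=N_{S}\) is an assumption made here for the sake of the example. Under this choice, Eq. (3) reduces to

$${\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})=\root k \of {1-\max \left\{ \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \beta _{1}-\beta _{2}\vert ^{k}\right\} }.$$

For instance, for \(p_{1}=(0.6,0.3),\) \(p_{2}=(0.4,0.6)\) and \(k=1,\) one obtains \({\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})=1-\max \{0.2,0.3\}=0.7.\)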

Theorem 3.3

Let \({\mathbb {S}}_{\text {P}}^{G}\) be the mapping given by Eq. (3). Then, for each \(p_{1}=(\alpha _{1},\beta _{1}),p_{2}=(\alpha _{2},\beta _{2}),p_{3}=(\alpha _{3},\beta _{3})\in I^{2}_{\text {P}},\) the following statements hold.

  1. (1)

    \(0\le {\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})\le 1;\)

  2. (2)

    \({\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})={\mathbb {S}}_{\text {P}}^{G}(p_{2},p_{1});\)

  3. (3)

    \({\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})=1\) if and only if \(p_{1}=p_{2};\)

  4. (4)

    If \(p_{1}\le p_{2}\le p_{3},\) then \({\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{3})\le \min \left\{ {\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2}),{\mathbb {S}}_{\text {P}}^{G}(p_{2},p_{3})\right\} .\)

Proof

  1. (1)

    From Definitions 2.3 and 2.6, it can be proved in a similar way as for statement (1) of Theorem 3.1.

  2. (2)

    From Eq. (3), it follows that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})=&\root k \of {N\left( G\left( \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \beta _{1}-\beta _{2}\vert ^{k}\right) \right) }\\ =&\root k \of {N\left( G\left( \vert \alpha _{2}-\alpha _{1}\vert ^{k},\vert \beta _{2}-\beta _{1}\vert ^{k}\right) \right) }\\ =&{\mathbb {S}}_{\text {P}}^{G}(p_{2},p_{1}). \end{aligned}$$

    Therefore, one concludes that \({\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})={\mathbb {S}}_{\text {P}}^{G}(p_{2},p_{1}).\)

  3. (3)

    From Eq. (3), it follows that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})=1 \Longleftrightarrow &\root k \of {N\left( G\left( \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \beta _{1}-\beta _{2}\vert ^{k}\right) \right) }=1 \\ \Longleftrightarrow & N\left( G\left( \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \beta _{1}-\beta _{2}\vert ^{k}\right) \right) =1. \end{aligned}$$

    Further, by statement (1) of Lemma 2.1, each strong fuzzy negation N satisfies \(N(x)=1\) if and only if \(x=0.\) Therefore, by condition (G2) of Definition 2.6, since N is a strong fuzzy negation, it follows that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})=1 \Longleftrightarrow & G\left( \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \beta _{1}-\beta _{2}\vert ^{k}\right) =0 \\ \Longleftrightarrow &\vert \alpha _{1}-\alpha _{2}\vert ^{k}=\vert \beta _{1}-\beta _{2}\vert ^{k}=0\\ \Longleftrightarrow & p_{1}=p_{2}. \end{aligned}$$

    Therefore, one concludes that \({\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})=1\) if and only if \(p_{1}=p_{2}.\)

  4. (4)

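    Let \(p_{1}=(\alpha _{1},\beta _{1}),p_{2}=(\alpha _{2},\beta _{2}), p_{3}=(\alpha _{3},\beta _{3})\in I^{2}_{\text {P}}\) with \(p_{1}\le p_{2}\le p_{3},\) that is, \(\alpha _{1}\le \alpha _{2}\le \alpha _{3}\) and \(\beta _{1}\ge \beta _{2}\ge \beta _{3}.\) Then, since \(x\mapsto x^{k}\) is increasing on I for \(k\in [1,+\infty [,\) it follows that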

    $$\vert \alpha _{1}-\alpha _{3}\vert ^{k}\ge \max \left\{ \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \alpha _{2}-\alpha _{3}\vert ^{k}\right\}$$

    and

    $$\vert \beta _{1}-\beta _{3}\vert ^{k}\ge \max \left\{ \vert \beta _{1}-\beta _{2}\vert ^{k},\vert \beta _{2}-\beta _{3}\vert ^{k}\right\} .$$

    Further, from Definition 2.3 and condition (G4) of Definition 2.6, since G is increasing and N is non-increasing, it follows that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{3})=&\root k \of {N\left( G\left( \vert \alpha _{1}-\alpha _{3}\vert ^{k},\vert \beta _{1}-\beta _{3}\vert ^{k}\right) \right) }\\ \le&\root k \of {N\left( G\left( \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \beta _{1}-\beta _{2}\vert ^{k}\right) \right) }\\ =&{\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2}) \end{aligned}$$

    and

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{3})=&\root k \of {N\left( G\left( \vert \alpha _{1}-\alpha _{3}\vert ^{k},\vert \beta _{1}-\beta _{3}\vert ^{k}\right) \right) }\\ \le&\root k \of {N\left( G\left( \vert \alpha _{2}-\alpha _{3}\vert ^{k},\vert \beta _{2}-\beta _{3}\vert ^{k}\right) \right) }\\ =&{\mathbb {S}}_{\text {P}}^{G}(p_{2},p_{3}). \end{aligned}$$

    Therefore, one concludes that \({\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{3})\le \min \left\{ {\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2}),{\mathbb {S}}_{\text {P}}^{G}(p_{2},p_{3})\right\} .\)

\(\square\)

For convenience, the mapping \({\mathbb {S}}^{G}_{\text {P}}\) defined in Definition 3.4 is called the G-similarity measure between PFNs.

Remark 3.2

  1. (1)

    In Definition 3.4, if we take \(N=N_{S},\) \(k=2\) and G as an associative grouping function, then, by Lemma 2.3, G is a t-conorm S and it follows that

    $${\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2})=\sqrt{1-S\left( \vert \alpha _{1}-\alpha _{2}\vert ^{2},\vert \beta _{1}-\beta _{2}\vert ^{2}\right) },\tag{4}$$

    that is, the G-similarity measure \({\mathbb {S}}^{G}_{\text {P}}(p_{1},p_{2})\) becomes the s-similarity measure between PFNs given by Definition 3.1 of [16].

  2. (2)

    In Definitions 3.2 and 3.4, if we take \(N_{a}=N_{b}=N\) and O as the N-dual overlap function of the grouping function G, then, by Lemma 2.2 and the involutivity of N, it follows that

    $$\begin{aligned} {\mathbb {S}}_{\text {P}}^{O}(p_{1},p_{2})&=\root k \of {O(N(\vert \alpha _{1}-\alpha _{2}\vert ^{k}),N(\vert \beta _{1}-\beta _{2}\vert ^{k}))}\\&=\root k \of {N\left( G\left( \vert \alpha _{1}-\alpha _{2}\vert ^{k},\vert \beta _{1}-\beta _{2}\vert ^{k}\right) \right) }\\&={\mathbb {S}}_{\text {P}}^{G}(p_{1},p_{2}). \end{aligned}$$

    Thus, Lemma 2.2 establishes the connection between the O-similarity measure \({\mathbb {S}}^{O}_{\text {P}}\) and the G-similarity measure \({\mathbb {S}}^{G}_{\text {P}}.\) Therefore, in the following, we only discuss the O-similarity measure.
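The identity in statement (2) of Remark 3.2 can be confirmed numerically for a concrete pair. The sketch below assumes \(N=N_{S}\) and the grouping function \(G(x,y)=x+y-xy\) as illustrative choices, builds the N-dual overlap function via Lemma 2.2, and compares Eq. (1) with Eq. (3). All function names are hypothetical.

```python
from typing import Callable, Tuple

PFN = Tuple[float, float]
Unary = Callable[[float], float]
Binary = Callable[[float, float], float]


def neg(x: float) -> float:
    """N_S(x) = 1 - x."""
    return 1.0 - x


def grouping(x: float, y: float) -> float:
    """G(x, y) = x + y - xy (illustrative grouping function)."""
    return x + y - x * y


def n_dual_overlap(g: Binary, n: Unary) -> Binary:
    """O(x, y) = N(G(N(x), N(y))), the N-dual overlap function of G."""
    return lambda x, y: n(g(n(x), n(y)))


def s_o(p1: PFN, p2: PFN, overlap: Binary, n: Unary, k: float) -> float:
    """Eq. (1) with N_a = N_b = n."""
    x = n(abs(p1[0] - p2[0]) ** k)
    y = n(abs(p1[1] - p2[1]) ** k)
    return max(0.0, overlap(x, y)) ** (1.0 / k)  # clamp guards against round-off


def s_g(p1: PFN, p2: PFN, g: Binary, n: Unary, k: float) -> float:
    """Eq. (3)."""
    inner = n(g(abs(p1[0] - p2[0]) ** k, abs(p1[1] - p2[1]) ** k))
    return max(0.0, inner) ** (1.0 / k)          # clamp guards against round-off


overlap = n_dual_overlap(grouping, neg)
p1, p2 = (0.6, 0.3), (0.4, 0.5)
print(round(s_o(p1, p2, overlap, neg, 2.0), 4), round(s_g(p1, p2, grouping, neg, 2.0), 4))
# 0.96 0.96
```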

Definition 3.5

Let \(U=\{u_{1},u_{2},\ldots ,u_{n}\}\) be a nonempty finite universe, \({\mathbb {S}}_{\text {P}}^{G}\) be the G-similarity measure between PFNs given by Eq. (3) and \(\omega =(\omega _{1},\omega _{2},\ldots ,\omega _{n})\in ]0,1]^{n}\) with \(\sum _{i=1}^{n}\omega _{i}=1.\) The mapping \({\mathbb {S}}^{G}:{\text {Map}}(U,I^{2}_{\text {P}})\times {\text {Map}}(U,I^{2}_{\text {P}})\rightarrow I\) is defined as

$${\mathbb {S}}^{G}(A,B)=\sum _{i=1}^{n}\omega _{i}{\mathbb {S}}_{\text {P}}^{G}(A(u_{i}),B(u_{i})),\tag{5}$$

where \(A,B\in {\text {Map}}(U,I^{2}_{\text {P}}).\)

Theorem 3.4

Let \(U=\{u_{1},u_{2},\ldots ,u_{n}\}\) be a nonempty finite universe and \({\mathbb {S}}^{G}\) be the mapping given by Eq. (5). For each \(A,B,C\in {\text{Map}}(U,I^{2}_{\text {P}}),\) the following statements hold.

  1. (1)

    \(0\le {\mathbb {S}}^{G}(A,B)\le 1;\)

  2. (2)

    \({\mathbb {S}}^{G}(A,B)={\mathbb {S}}^{G}(B,A);\)

  3. (3)

    \({\mathbb {S}}^{G}(A,B)=1\) if and only if \(A=B;\)

  4. (4)

    If \(A\subseteq B \subseteq C,\) then \({\mathbb {S}}^{G}(A,C)\le \min \left\{ {\mathbb {S}}^{G}(A,B),{\mathbb {S}}^{G}(B,C)\right\} .\)

Proof

From Definitions 3.4 and 3.5 and Theorem 3.3, it can be proved in a similar way as for Theorem 3.2. \(\square\)

From Theorem 3.4, one has that the mapping \({\mathbb {S}}^{G}\) given by Definition 3.5 is a similarity measure between PFSs. The mapping \({\mathbb {S}}^{G}\) is called the G-similarity measure between PFSs.

4 Numerical Examples

In this section, we present some numerical examples to illustrate the properties of the O-similarity measure between PFSs.

Example 4.1

Let \(U=\{u\}\) be a universe, \(\alpha ,\beta \in [0,1]\) with \(0\le \alpha ^{2}+\beta ^{2}\le 1,\) and \(A,B\in {\text {Map}}(U,I^{2}_{\text {P}})\) where \(A=\frac{(\alpha ,\beta )}{u}\) and \(B=\frac{(\beta ,\alpha )}{u}.\)

If we take the overlap function O as \(O_{+},\) the strong fuzzy negations \(N_{a}\) and \(N_{b}\) as \(N_{s}\) and \(k=2,\) then the degree of similarity between the PFSs A and B calculated by \({\mathbb {S}}^{O_{+}}\) is shown in Fig. 3. From Fig. 3, we make the following observations.

  1. (1)

    \({\mathbb {S}}^{O_{+}}(A,B)\) attains the maximum value 1 if and only if \(A=B\) (that is, \(\alpha =\beta\));

  2. (2)

    If \(\alpha =0,\beta =1\) or \(\alpha =1,\beta =0,\) then \({\mathbb {S}}^{O_{+}}(A,B)\) attains the minimum value 0;

  3. (3)

    For each \(\alpha ,\beta \in [0,1]\) with \(0\le \alpha ^{2}+\beta ^{2}\le 1,\) \({\mathbb {S}}^{O_{+}}(A,B)\in [0,1];\)

  4. (4)

    \({\mathbb {S}}^{O_{+}}(A,B)={\mathbb {S}}^{O_{+}}(B,A),\) that is, \({\mathbb {S}}^{O_{+}}\) is symmetric.

Thus, Example 4.1 illustrates that the O-similarity measure \({\mathbb {S}}^{O_{+}}\) satisfies conditions (SP1)–(SP3) of Definition 3.1.

Fig. 3 The similarity measure of PFSs given in Example 4.1
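The surface in Fig. 3 can also be described in closed form. The following short derivation is a sketch under the settings of Example 4.1; the abbreviation \(d=\vert \alpha -\beta \vert\) is introduced here only for convenience. Since \(A(u)=(\alpha ,\beta )\) and \(B(u)=(\beta ,\alpha ),\) both differences in Eq. (1) equal d. With \(O=O_{+},\) both negations equal to \(x\mapsto 1-x,\) \(k=2\) and the single weight \(\omega _{1}=1,\) Eqs. (1) and (2) give

$$\begin{aligned} {\mathbb {S}}^{O_{+}}(A,B)&=\sqrt{O_{+}\left( 1-d^{2},1-d^{2}\right) }\\&=\sqrt{\frac{2(1-d^{2})^{2}}{1+(1-d^{2})^{2}}}. \end{aligned}$$

The right-hand side is decreasing in d on [0, 1], equals 1 exactly when \(d=0\) and equals 0 exactly when \(d=1,\) which agrees with observations (1) and (2) of Example 4.1.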

Example 4.2

Let \(U=\{u\}\) be a universe and \(A,B,C\in {\text {Map}}(U,I^{2}_{\text {P}})\) where \(A=\frac{(0.1,0.9)}{u},\) \(B=\frac{(0.4,0.8)}{u}\) and \(C=\frac{(0.6,0.6)}{u}.\)

Clearly, \(A\subseteq B \subseteq C.\) If we take the overlap function O as \(O_{\text {DB}},\) the strong fuzzy negation \(N_{a}\) as \(N_{s}\), the strong fuzzy negation \(N_{b}\) as \(N_{1,2}\) and \(k=1,\) then one has that \({\mathbb {S}}^{O_{\text {DB}}}(A,C)=0.4667,\) \({\mathbb {S}}^{O_{\text {DB}}}(A,B)=0.7241\) and \({\mathbb {S}}^{O_{\text {DB}}}(B,C)=0.6667.\) Thus, one obtains that \({\mathbb {S}}^{O_{\text {DB}}}(A,C)\le \min \{{\mathbb {S}}^{O_{\text {DB}}}(A,B),{\mathbb {S}}^{O_{\text {DB}}}(B,C)\}.\)

Thus, Example 4.2 illustrates that the O-similarity measure \({\mathbb {S}}^{O_{\text {DB}}}\) satisfies condition (SP4) of Definition 3.1.
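The three values reported in Example 4.2 can be reproduced with a few lines of code. The sketch below hard-codes the settings of the example (\(O=O_{\text {DB}},\) \(N_{a}(x)=1-x,\) \(N_{b}=N_{1,2},\) \(k=1,\) one-element universe); the function names are hypothetical.

```python
from typing import Tuple

PFN = Tuple[float, float]


def neg_standard(x: float) -> float:
    """N_a(x) = 1 - x."""
    return 1.0 - x


def neg_lambda(x: float, lam: float = 2.0) -> float:
    """N_b = N_{1,lambda} with lambda = 2, i.e. (1 - x) / (1 + 2x)."""
    return (1.0 - x) / (1.0 + lam * x)


def o_db(x: float, y: float) -> float:
    """O_DB(x, y) = 2xy / (x + y), and 0 when x + y = 0."""
    return 0.0 if x + y == 0.0 else 2.0 * x * y / (x + y)


def similarity(p1: PFN, p2: PFN) -> float:
    """Eq. (1) with k = 1; on U = {u} this is also the value of Eq. (2)."""
    x = neg_standard(abs(p1[0] - p2[0]))
    y = neg_lambda(abs(p1[1] - p2[1]))
    return o_db(x, y)


A, B, C = (0.1, 0.9), (0.4, 0.8), (0.6, 0.6)
s_ac, s_ab, s_bc = similarity(A, C), similarity(A, B), similarity(B, C)
print(round(s_ac, 4), round(s_ab, 4), round(s_bc, 4))  # 0.4667 0.7241 0.6667
```

The exact values are \(\frac{7}{15},\) \(\frac{21}{29}\) and \(\frac{2}{3},\) which round to the figures given in the example.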

Next, we illustrate the advantages and reasonableness of the O-similarity measure. Let \(U=\{u_{1},u_{2},\ldots ,u_{n}\}\) be a universe and \(A,B\in {\text {Map}}(U,I^{2}_{\text {P}})\) where \(A(u_{i})=(\mu _{A}(u_{i}),\nu _{A}(u_{i}))\) and \(B(u_{i})=(\mu _{B}(u_{i}),\nu _{B}(u_{i}))\) for each \(u_{i}\in U.\) Some existing similarity measures between A and B are listed in Table 1.

Table 1 Some existing similarity measures between PFSs

Example 4.3

This numerical example is adopted from [36] and includes six cases. All PFSs in this example are on the universe \(U=\{u\}.\) The degrees of similarity between PFSs calculated by existing methods are shown in Table 2.

Then, we construct some O-similarity measures to calculate the degrees of similarity in these cases. In this example, we always take \(k=2\) and \(N_{a}\) as \(N_{s}.\) Once the strong fuzzy negation \(N_{b}\) is fixed, the corresponding O-similarity measure is denoted by \({\mathbb {S}}_{N_{b}}^{O}.\) The degrees of similarity between PFSs calculated by different O-similarity measures are shown in Table 3.

Finally, we make the following observations based on Tables 2 and 3.

  1. (1)

    For case 1, the degrees of similarity between the PFSs A and B calculated by \({\mathbb {S}}_{\text {C}}\) and \({\mathbb {S}}_{\text {P}}^{2}\) are both 1. However, A is not equal to B. Thus, the O-similarity measures used in this example are better than \({\mathbb {S}}_{\text {C}}\) and \({\mathbb {S}}_{\text {P}}^{2}\) in case 1.

  2. (2)

    For cases 1 and 2, the results calculated by \({\mathbb {S}}_{\text {HK}}\) are the same, although the two cases are different. Meanwhile, the degree of similarity in cases 1 and 2 cannot be calculated by \({\mathbb {S}}_{\text {Y}}\) because of the “division by zero” problem. Thus, the O-similarity measures used in this example are better than \({\mathbb {S}}_{\text {HK}}\) and \({\mathbb {S}}_{\text {Y}}\) under cases 1 and 2.

  3. (3)

    For cases 3 and 4, the similarity measures \({\mathbb {S}}_{{\text {HY}}1},\) \({\mathbb {S}}_{{\text {HY}}2},\) \({\mathbb {S}}_{{\text {HY}}3},\) \({\mathbb {S}}_{\text {C}},\) \({\mathbb {S}}_{\text {L}},\) \({\mathbb {S}}_{\text {HK}},\) \({\mathbb {S}}^{O_{1}}_{N_{s}},\) \({\mathbb {S}}^{O_{2}}_{N_{s}},\) \({\mathbb {S}}^{O_{0.5}}_{N_{s}},\) \({\mathbb {S}}^{O_{+}}_{N_{s}},\) \({\mathbb {S}}^{O_{\text {DB}}}_{N_{s}},\) \({\mathbb {S}}^{O_{\text {mM}}}_{N_{s}}\) and \({\mathbb {S}}^{O_{\text {Mid}}}_{N_{s}}\) cannot distinguish these two different cases. Thus, these similarity measures need to be improved. It is worth noting that the other O-similarity measures used in this example, which take \(N_{b}\) different from \(N_{s},\) can distinguish these two cases. That is why two strong fuzzy negations \(N_{a}\) and \(N_{b}\) are considered in Definition 3.2.

  4. (4)

    For cases 5 and 6, the O-similarity measures cannot distinguish the two cases. Thus, under these cases, some existing similarity measures are better than the O-similarity measures.

Table 2 The degree of similarity between PFSs calculated by existing methods in Example 4.3
Table 3 The degree of similarity between PFSs calculated by O-similarity measures in Example 4.3

Example 4.4

In this example, we consider five special cases, in which all PFSs are defined on the universe \(U=\{u\}.\) Further, we always take \(k=1\) and strong fuzzy negations \(N_{a}=N_{b}=N_{s}.\) The degrees of similarity between PFSs calculated by different similarity measures are shown in Table 4. We now analyze Table 4 as follows:

For cases 1 and 2, the similarity measures \({\mathbb {S}}_{\text {P}}^{1},\) \({\mathbb {S}}_{\text {P}}^{2},\) \({\mathbb {S}}_{\text {P}}^{3},\) \({\mathbb {S}}_{\text {P}}^{4},\) \({\mathbb {S}}_{\text {P}}^{5},\) \({\mathbb {S}}_{\text {Z}}^{1},\) \({\mathbb {S}}_{\text {Z}}^{2},\) \({\mathbb {S}}_{\text {Z}}^{3}\) and \({\mathbb {S}}_{\text {P}}^{4}\) cannot distinguish these two different cases, although some of them consider the degree of hesitancy. For cases 3, 4 and 5, the same problem arises with these similarity measures. Therefore, O-similarity measures are better than these similarity measures under these cases.

Table 4 The degree of similarity between PFSs in Example 4.4

5 Applications of O-Similarity Measure in Classification and Clustering Problems

In this section, we design a series of experiments to verify the performance of the O-similarity measure. These experiments are implemented in MATLAB R2016a on a personal computer with 64-bit Windows 10, an Intel Core i5-8300H CPU at 2.30 GHz and 16.0 GB of RAM.

5.1 Application in Classification on the Basis of a Classical Example

Let \(U=\{u_{1},u_{2},u_{3},u_{4}\}\) be a universe and \(P_{i}\) be a known target with class label \(C_{i}\in {\text {Map}}(U,I^{2}_{\text {P}})\) (\(i=1,2,3\)) where

$$\begin{aligned}&C_{1}=\frac{(0.3,0.3)}{u_{1}}+\frac{(0.4,0.4)}{u_{2}}+\frac{(0.4,0.4)}{u_{3}}+\frac{(0.4,0.4)}{u_{4}},\\&C_{2}=\frac{(0.5,0.5)}{u_{1}}+\frac{(0.1,0.1)}{u_{2}}+\frac{(0.5,0.5)}{u_{3}}+\frac{(0.1,0.1)}{u_{4}}\,\text {and}\\&C_{3}=\frac{(0.5,0.4)}{u_{1}}+\frac{(0.4,0.5)}{u_{2}}+\frac{(0.3,0.3)}{u_{3}}+\frac{(0.2,0.2)}{u_{4}}, \end{aligned}$$

respectively. The unknown target Q is given by

$$Q=\frac{(0.4,0.4)}{u_{1}}+\frac{(0.5,0.5)}{u_{2}}+\frac{(0.2,0.2)}{u_{3}}+\frac{(0.3,0.3)}{u_{4}}.$$

The goal is to identify which class Q belongs to, that is, the class \(C_{i}\) with the highest degree of similarity to Q. This example is also used in [14]. In this subsection, we always take strong fuzzy negations \(N_{a}\) and \(N_{b}\) as \(N_{s},\) \(\omega =(0.25,0.25,0.25,0.25)\) and \(k=1.\) The classification results of different similarity measures are shown in Table 5.
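The decision rule behind this example, assigning Q to the class whose pattern is most similar to it, can be sketched in Python. This is only an illustration under stated assumptions: the O-similarity of Definition 3.2 is not reproduced here. In its place the sketch uses `hamming_similarity`, a hypothetical stand-in defined as one minus a weighted Hamming distance on squared membership and non-membership degrees, so its output should not be read as a row of Table 5. The names `is_pfs`, `classify`, `CLASSES`, `Q` and `OMEGA` are likewise introduced only for this sketch.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A PFS on a finite universe: one (membership, non-membership) pair per element.
PFS = List[Tuple[float, float]]


def is_pfs(a: PFS) -> bool:
    """Check the Pythagorean constraint mu^2 + nu^2 <= 1 for every element."""
    return all(
        0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu**2 + nu**2 <= 1.0
        for mu, nu in a
    )


def hamming_similarity(a: PFS, b: PFS, w: Sequence[float]) -> float:
    """Stand-in similarity (NOT the O-similarity of Definition 3.2):
    1 minus half the weighted Hamming distance between squared degrees."""
    dist = sum(
        wi * (abs(ma**2 - mb**2) + abs(na**2 - nb**2))
        for wi, (ma, na), (mb, nb) in zip(w, a, b)
    )
    return 1.0 - 0.5 * dist


def classify(
    q: PFS,
    classes: Dict[str, PFS],
    w: Sequence[float],
    sim: Callable[[PFS, PFS, Sequence[float]], float] = hamming_similarity,
) -> str:
    """Return the label of the class pattern most similar to q."""
    return max(classes, key=lambda label: sim(classes[label], q, w))


# Data of the example in Sect. 5.1.
CLASSES: Dict[str, PFS] = {
    "C1": [(0.3, 0.3), (0.4, 0.4), (0.4, 0.4), (0.4, 0.4)],
    "C2": [(0.5, 0.5), (0.1, 0.1), (0.5, 0.5), (0.1, 0.1)],
    "C3": [(0.5, 0.4), (0.4, 0.5), (0.3, 0.3), (0.2, 0.2)],
}
Q: PFS = [(0.4, 0.4), (0.5, 0.5), (0.2, 0.2), (0.3, 0.3)]
OMEGA = (0.25, 0.25, 0.25, 0.25)

print(classify(Q, CLASSES, OMEGA))
```

Passing the similarity as the `sim` argument keeps the decision rule separate from the measure, so any similarity from Table 1 or any O-similarity measure could be substituted without changing `classify`.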

From Table 5, we make the following observations. Most similarity measures can determine the classification result, and they all yield the same result. Only three similarity measures, namely \({\mathbb {S}}_{\text {Y}}\), \({\mathbb {S}}_{\text {C}}\) and \({\mathbb {S}}_{\text {P}}^{2}\), cannot determine the classification result. This shows that the O-similarity measure is reasonable and effective, and that it is superior to some existing similarity measures.

Table 5 The classified results of different similarity measures in Sect. 5.1

5.2 Application in Classification on the Basis of Six Datasets from the UCI Database

In this subsection, we show the effectiveness of the O-similarity measures by repeated 2-fold cross-validation on six datasets. The six datasets are downloaded from the UCI machine learning data repository (http://archive.ics.uci.edu/ml). The detailed information of the datasets is described in Table 6.

Table 6 Basic information of experimental datasets

We use the method given by [5] to generate PFSs for the six datasets. Then we apply various existing similarity measures and O-similarity measures to the classification method given by [5] (see Algorithm 1) under fuzzy environment and repeat 2-fold cross-validation 100 times for each experiment. In this subsection, we always take strong fuzzy negations \(N_{a}\) and \(N_{b}\) as \(N_{s},\) \(k=1\) and \(\omega _{i}=\frac{1}{n}\) for \(i=1,2,\ldots ,n\) whenever the universe \(U=\{u_{1},u_{2},\ldots ,u_{n}\}.\) The classification results are shown in Table 7.
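The evaluation protocol, 2-fold cross-validation repeated 100 times, can be sketched as follows. This is a minimal Python illustration and not the MATLAB code used for the experiments. The callable `fit_predict` is a hypothetical stand-in for the classifier of Algorithm 1, and the plain random, unstratified split is an assumption, since the splitting procedure is not specified in the text.

```python
import random
from typing import Callable, List, Sequence


def repeated_two_fold_accuracy(
    samples: Sequence,
    labels: Sequence,
    fit_predict: Callable[[list, list, list], list],
    repeats: int = 100,
    seed: int = 0,
) -> float:
    """Mean accuracy over `repeats` random 2-fold splits.

    `fit_predict(train_x, train_y, test_x)` is a hypothetical stand-in for the
    classifier; it must return one predicted label per test sample.
    """
    n = len(samples)
    if n < 2 or n != len(labels):
        raise ValueError("need at least two samples and one label per sample")
    rng = random.Random(seed)
    accuracies: List[float] = []
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)  # assumption: plain random split, no stratification
        half = n // 2
        folds = (order[:half], order[half:])
        # Each fold serves once as the test set and once as the training set.
        for test_idx, train_idx in (folds, folds[::-1]):
            train_x = [samples[i] for i in train_idx]
            train_y = [labels[i] for i in train_idx]
            test_x = [samples[i] for i in test_idx]
            predicted = fit_predict(train_x, train_y, test_x)
            hits = sum(p == labels[i] for p, i in zip(predicted, test_idx))
            accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

Fixing `seed` makes the 100 splits reproducible, so different similarity measures can be compared on identical partitions of each dataset.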

Table 7 The classification accuracy of different similarity measures in Sect. 5.2

We now show some analyses on the basis of Table 7 as follows:

  1. (1)

    For datasets “Wine”, “Blood” and “Banknote”, there is always one O-similarity measure that achieves the highest classification accuracy among all compared similarity measures.

  2. (2)

    For datasets “Seed”, “Haberman” and “Wireless”, although the highest accuracy of classification is always given by \({\mathbb {S}}_{\text {Y}},\) there is always one O-similarity measure \({\mathbb {S}}^{O}\) such that the accuracy of classification given by \({\mathbb {S}}^{O}\) is second only to \({\mathbb {S}}_{\text {Y}}.\)

Therefore, on the basis of the above analyses, the performance of the O-similarity measures is superior to other methods under some conditions.

5.3 Application in Clustering on the Basis of a Classical Example

In this subsection, we use the example given by [14] to illustrate the application of O-similarity measure in clustering. To facilitate understanding, we briefly introduce the dataset as follows.

Software project [14]: The dataset contains ten software projects \(P_{i}\) (\(i=1,2,\ldots ,10\)) and five criteria: economic feasibility \(u_{1},\) technological feasibility \(u_{2},\) staff feasibility \(u_{3},\) period feasibility \(u_{4}\) and legal feasibility \(u_{5}.\) The weight vector of the five criteria is given by \(\omega =(0.1,0.15,0.2,0.3,0.25).\) The evaluation information given by experts is represented by PFSs. The details of this dataset are shown in Table 8.

Table 8 The evaluation information of ten software projects [14]

In this subsection, we always take the overlap function O as \(O_{+}\), the strong fuzzy negations \(N_{a}\) and \(N_{b}\) as \(N_{s}\) and \(k=1.\) According to the clustering algorithm given by [14], the clustering results of the ten software projects calculated by \({\mathbb {S}}^{O_{+}}\) are shown in Table 9. It is worth noting that the clustering result with seven clusters \(\{P_{1}\},\) \(\{P_{2},P_{5}\},\) \(\{P_{3}\},\) \(\{P_{4}\},\) \(\{P_{6}\},\) \(\{P_{7}\},\) \(\{P_{8},P_{9},P_{10}\}\) is supported by [14]. In addition, we obtain reasonable results after only three iterations, which equals the number of iterations required by the clustering algorithm in Peng’s work [14].

Table 9 The clustering result of ten software projects in Sect. 5.3

5.4 Application in Clustering on the Basis of Datasets from the UCI Database

In this subsection, we show the effectiveness of the O-similarity measures in clustering problems. We apply different similarity measures to the six datasets used in Sect. 5.2 and repeat 2-fold cross-validation 100 times for each dataset. We use the method given by [5] to generate PFSs for the datasets and use the classical k-means algorithm to cluster the generated PFSs. In this subsection, we always take the strong fuzzy negation \(N_{a}\) as \(N_{s},\) the strong fuzzy negation \(N_{b}\) as \(N_{1,1},\) \(k=1\) and \(\omega _{i}=\frac{1}{n}\) for \(i=1,2,\ldots ,n\) whenever the universe \(U=\{u_{1},u_{2},\ldots ,u_{n}\}.\) We evaluate the similarity measures listed in Table 1 and the O-similarity measures by the purity of the clustering results. The experimental results are shown in Table 10.
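Purity is not defined in this section. The following Python sketch assumes the standard definition: each cluster is assigned its most frequent true class, and purity is the fraction of samples that fall into the majority class of their cluster. The function name `purity` and its arguments are introduced here for illustration only.

```python
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def purity(cluster_ids: Sequence[Hashable], true_labels: Sequence[Hashable]) -> float:
    """Clustering purity under the standard definition (an assumption here):
    (1/N) * sum over clusters of the size of the cluster's majority class."""
    if len(cluster_ids) == 0 or len(cluster_ids) != len(true_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    # Count how often each true class occurs inside each cluster.
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster[cluster][label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_ids)
```

Under this definition purity lies in (0, 1] and equals 1 only when every cluster contains samples of a single class, which is why the differences reported for Table 10 are compared on the same scale across datasets.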

We now analyze the results in Table 10 as follows:

  1. (1)

    For datasets “Wine”, “Banknote” and “Wireless”, there is always one O-similarity measure that achieves the highest clustering purity among all compared similarity measures.

  2. (2)

    For dataset “Seed”, although the highest purity of clustering is given by \({\mathbb {S}}_{\text {Y}},\) the second highest purity of clustering is given by \({\mathbb {S}}^{O_{0.5}}\) and \({\mathbb {S}}^{O_{\text {DB}}}.\) Meanwhile, the difference between the two is only 0.0093.

  3. (3)

    For dataset “Haberman”, the highest purity of clustering is given by \({\mathbb {S}}_{\text {P}}^{1}.\) The highest purity of clustering calculated by O-similarity measures is given by \({\mathbb {S}}^{O_{2}}.\) The difference between the two is only 0.0008.

  4. (4)

    For dataset “Blood”, the highest purity of clustering is given by \({\mathbb {S}}_{\text {Y}}\) and \({\mathbb {S}}_{\text {P}}^{1}.\) The highest purity of clustering calculated by O-similarity measures is given by \({\mathbb {S}}^{O_{0.5}}\) and \({\mathbb {S}}^{O_{\text {DB}}}.\) The difference between the two is only 0.0002.

Therefore, on the basis of the above analyses, the performance of the O-similarity measures is superior to that of other methods under some conditions.

Table 10 The purity of clustering of different similarity measures in Sect. 5.4

6 Conclusion

We mainly studied the construction methods of similarity measures between PFSs based on overlap and grouping functions. To be precise, we achieved the following:

  1. (1)

    We proposed two construction methods of similarity measures between PFSs, called O-similarity measure and G-similarity measure, based on overlap and grouping functions. The connection between O-similarity measure and G-similarity measure was established. We also established the connection between G-similarity measure and s-similarity measure derived from t-conorms.

  2. (2)

    We used four numerical examples to illustrate the effectiveness, reliability and deficiencies of the O-similarity measure in comparison with other existing similarity measures.

  3. (3)

    We applied some O-similarity measures to classification and clustering problems. On the one hand, we showed the application of the O-similarity measure in classification and clustering by using classical examples. On the other hand, we used six datasets from the UCI database to illustrate its effectiveness and reliability by comparison with some existing similarity measures. The experimental results showed that the O-similarity measure performed well in data-driven environments.

In future work, on the one hand, we will investigate construction methods of similarity measures between PFSs based on other aggregation functions, such as uninorms [37]. On the other hand, we will study the differences among the O-similarity measures constructed from different overlap functions, strong fuzzy negations and values of the parameter k.