1 Introduction

Distance and similarity measures are two basic measures in hesitant fuzzy set (HFS) theory, and they have been applied in many fields such as decision making [13, 15, 32], machine learning, and clustering analysis [17, 28]. The HFS [22], derived from the fuzzy set [29], was first introduced by Torra and Narukawa [23] and is still developing rapidly [10, 14, 16]. An HFS describes the situation in which the membership degree of an element to a set takes several values between 0 and 1, each of which is an independent membership degree. HFSs are usually preferred in the following situations [23, 25, 26]:

  1. Decision makers have hesitations in making decisions;

  2. Several decision makers cannot convince each other while estimating;

  3. Some data are uncertain or incomplete.

Extending HFSs, Chen et al. [6] further developed interval-valued hesitant fuzzy sets (IVHFSs). In view of the significance of distance and similarity measures, some widely used distance measures, such as the Hamming distance, the normalized Hamming distance, the Euclidean distance and the normalized Euclidean distance, have been developed for fuzzy sets [5, 9, 12]. Later, Xu and Xia [27] developed the axioms of distance and similarity measures for HFSs and further proposed a series of distance measures for HFSs.

When computing the distance or similarity between HFSs, the traditional hesitant fuzzy distance and similarity measures take only the values of all the attributes into consideration, while the subjective importance degrees of different attributes and the preference relationships between different alternative options are neglected [27]. Nosofsky [18] first proposed that each dimension in the attribute space can receive a unique amount of attention from individuals, depending on their preferences for different attributes. Dimensions are stretched or shrunk according to the attention that individuals pay to them [4]. Concerning the relationships between alternative options, Huber et al. [11] proposed that there are two kinds of directions from one alternative option to another: the dominance direction and the indifference direction. When calculating the distances between options, the distances in the dominance direction should be assigned a higher weight than those in the indifference direction.

According to the papers [1, 24], the dominance direction and the indifference direction rotate when the weight distribution of the attributes changes. It is apparent that no matter how the weight distribution changes, the two directions remain orthogonal. Rooderkerk et al. [20] and Berkowitsch et al. [1] suggested giving the distance in the dominance direction a high individual weight, while keeping the weights of the distances in the indifference directions relatively low. This method expands the influence of the dominance direction in the calculation of psychological distances, so that the experts' preferences for certain alternatives can be reflected. Inspired by this line of research, Berkowitsch et al. [1] presented a generalized psychological distance that takes the preferential relationships between the options into account.

Clustering is an unsupervised classification of patterns. It organizes samples with a high degree of similarity into a cluster. It is quite useful and has been applied in many fields such as information retrieval and data mining [8]. Traditional clustering methods are hard clustering methods, which cannot deal with fuzzy information. After Zadeh proposed the fuzzy set, Ruspini first studied fuzzy clustering problems [21], and then several fuzzy clustering methods were proposed, such as the fuzzy C-means (FCM) clustering algorithm [2], which is widely applicable in daily life [3, 19]. To cluster the hesitant fuzzy information that arises in practice, many hesitant fuzzy clustering algorithms have been developed, for example, the minimal spanning tree clustering analysis method under a hesitant fuzzy environment [30], hesitant fuzzy agglomerative hierarchical clustering algorithms [31] and the hierarchical hesitant fuzzy K-means clustering algorithm [7]. Hesitant fuzzy distance measures play an important role in clustering hesitant fuzzy information.

As mentioned above, although the existing hesitant fuzzy distance measures are very useful, they cannot utilize the preference relationships between different alternative options; that is to say, they cannot take full advantage of the background information of the alternatives. The relative influences of the alternatives do not affect the distances between them; in other words, the alternatives in the space of traditional hesitant fuzzy distance measures are isolated and cannot affect each other through their superior or inferior influences. To overcome this weakness, in this paper we propose a novel psychological distance measure for HFSs, inspired by the idea of Berkowitsch et al.'s paper [1]. The alternatives in the space of the psychological distance measure are connected to each other: relatively superior or weaker influences between the alternatives can strengthen or weaken the distances between them, in which case the distance measure fits the inner activities of people. Thus, the novel distance measure could be relatively more functional when applied to fields such as clustering and decision analysis. To construct the new hesitant fuzzy distance measure, two main innovations are developed in this paper. The first is that we construct distance vectors between HFSs, endowing the distances between HFEs with plus or minus signs via the comparison method of HFEs [25]. The second is that, in order to guarantee that the final hesitant fuzzy psychological distance measure satisfies the axiomatic properties of a hesitant fuzzy distance measure, we present a balancing coefficient so that the proposed psychological distances for HFSs all lie between 0 and 1. All the work mentioned above ensures the application value and the rationality of the final distance measure.

The article is organized as follows: Sect. 2 introduces some basic concepts of HFSs and previous research on psychological distance; Sect. 3 develops the new hesitant fuzzy psychological distance measure; Sect. 4 applies it in clustering analysis and compares it with the existing hesitant fuzzy Hamming distance; finally, the article ends with some concluding remarks in Sect. 5.

2 Preliminaries

In this section, several basic concepts are prepared for introducing the new distance measure. In the following, we recall the concepts of the hesitant fuzzy set (HFS), the distance and similarity measures of HFSs, and previous work on the psychological distance measure.

2.1 Hesitant fuzzy set

In real life, when deciding the membership degree of an element to a given set, different people may make different judgments. For example, some people may choose 0.3, while others may choose 0.5. In light of the subjective differences among decision makers, Torra [22] presented the hesitant fuzzy set (HFS), which elaborately demonstrates the situation in which the membership degree of an element to a set consists of several values between 0 and 1. The HFS can be very effective when the decision makers have not reached an agreement or have hesitations in making decisions. For example, suppose some decision makers are invited to estimate the degree to which an alternative fits a given criterion. Some decision makers think that the alternative strongly satisfies the criterion and choose 0.8, others are not so sure and choose 0.5, and the rest may hold a pessimistic attitude and choose 0.3. Since these three groups cannot persuade each other to change their minds, the membership degree can be expressed in the form \( \{ 0.8,0.5, \, 0.3\} \), which is a hesitant fuzzy element (HFE) [25]. This form is rational because it contains all the information that the decision makers provided.

Definition 2.1

[22] If \( X \) is a fixed set, a hesitant fuzzy set (HFS) \( A \) on \( X \) is in terms of a function that maps the elements in \( X \) to a subset of \( [0,1] \).

For ease of understanding, Xia and Xu [25] proposed the following mathematical symbol to express a HFS: \( A = \{ \left\langle {x,h_{A} (x)} \right\rangle \left| {x \in X} \right.\} \), where \( h_{A} (x) \), a set of several different values between 0 and 1, represents the possible membership degrees of the element \( x \) to the set \( A \). Xia and Xu [25] defined \( h_{A} (x) \) as a hesitant fuzzy element (HFE), which is the basic unit of a HFS. Torra [22] gave some operations on HFEs as follows: (1) \( h^{c} = \bigcup\nolimits_{\gamma \in h} {\left\{ {1 - \gamma } \right\}} \); (2) \( h_{1} \cup h_{2} \)\( = \bigcup\nolimits_{{\gamma_{1} \in h_{1} ,\gamma_{2} \in h_{2} }} {\hbox{max} \left\{ {\gamma_{1} ,\gamma_{2} } \right\}} \); (3) \( h_{1} \cap h_{2} = \bigcup\nolimits_{{\gamma_{1} \in h_{1} ,\gamma_{2} \in h_{2} }} {\hbox{min} \left\{ {\gamma_{1} ,\gamma_{2} } \right\}} \), where \( h \), \( h_{1} \) and \( h_{2} \) are three HFEs.

Definition 2.2

[22] Suppose that \( h \) is a HFE; then the envelope of \( h \) is defined as \( A_{env} (h) \), which can be represented by the mathematical symbol \( (h^{ - } ,1 - h^{ + } ) \), where \( h^{ - } \) and \( h^{ + } \) are the HFE's lower and upper bounds, respectively.

Additionally, Xia and Xu [25] defined several operations for HFEs: (1) \( h^{\lambda } = \cup_{\gamma \in h} \left\{ {\gamma^{\lambda } } \right\} \); (2) \( \lambda h = \cup_{\gamma \in h} \left\{ {1 - (1 - \gamma )^{\lambda } } \right\} \); (3) \( h_{1} \oplus h_{2} = \cup_{{\gamma_{1} \in h_{1} ,\gamma_{2} \in h_{2} }} \left\{ {\gamma_{1} + \gamma_{2} - \gamma_{1} \gamma_{2} } \right\} \); (4) \( h_{1} \otimes h_{2} = \cup_{{\gamma_{1} \in h_{1} ,\gamma_{2} \in h_{2} }} \left\{ {\gamma_{1} \gamma_{2} } \right\} \).

In order to demonstrate the main algorithm proposed in this paper, it is necessary to first illustrate the comparison method of HFEs [25].

For a HFE \( h \), the score of \( h \): \( s(h) = \frac{1}{{l_{h} }}\sum\nolimits_{\gamma \in h} \gamma \) is used to measure the magnitude relationship between HFEs, where \( l_{h} \) is the number of elements in \( h \). For any two HFEs \( h_{1} \) and \( h_{2} \), if \( s(h_{1} ) > s(h_{2} ) \), then \( h_{1} \) is superior to \( h_{2} \), denoted by \( h_{1} > h_{2} \); if \( s(h_{1} ) = s(h_{2} ) \), then \( h_{1} \) is indifferent to \( h_{2} \), denoted by \( h_{1} {{\sim}} h_{2} \).

2.2 The distance and similarity measures of HFSs

Firstly, some properties that the distance measures of HFSs should satisfy will be illustrated. Assume that \( X = \{ x_{1} ,x_{2} , \ldots ,x_{n} \} \) is a discrete universe, \( A \) and \( B \) are two HFSs on \( X \), and \( d(A,B) \) is the distance measure between \( A \) and \( B \). Xu and Xia [27] provided several properties that \( d(A,B) \) should satisfy: (1) \( 0\le d(A,B) \le 1 \); (2) \( d(A,B) = 0 \), if and only if \( A = B \); (3) \( d(A,B) = d(B,A) \).

Suppose that \( A \) and \( B \) are two HFSs over \( X \), Xu and Xia [27] defined the similarity measure between \( A \) and \( B \) as \( \bar{s}(A,B) \), and pointed out that the similarity measure should satisfy several properties as follows: (1) \( 0 \le \bar{s}(A,B) \le 1 \); (2) \( \bar{s}(A,B) = 1 \) if and only if \( A = B \); (3) \( \bar{s}(A,B) = \bar{s}(B,A) \).

According to Xu and Xia [27], \( \bar{s}(A,B) = 1 - d(A,B) \). Therefore, if we develop a distance measure based on HFSs, we can easily gain the corresponding similarity measure.

Each element \( x_{i} (i = 1,2, \ldots ,n) \) of the universe \( X \) can have a unique importance degree. Let \( w = \left( {w_{1} ,w_{2} , \ldots ,w_{n} } \right)^{T} \) be the weight vector of these elements, which should satisfy \( w_{i} \ge 0 \) for any \( i = 1,2, \ldots ,n \) and \( \sum\nolimits_{i = 1}^{n} {w_{i} } = 1 \). For two HFSs \( A \) and \( B \), Xu and Xia [27] defined several distance formulas based on the traditional Hamming distance and Euclidean distance:

  1. The generalized hesitant weighted distance:

    $$ d_{1} (A,B) = \left[ {\sum\limits_{i = 1}^{n} {w_{i} \left( {\frac{1}{{l_{{x_{i} }} }}\sum\limits_{j = 1}^{{l_{{x_{i} }} }} {\left| {h_{A}^{\sigma (j)} (x_{i} ) - h_{B}^{\sigma (j)} (x_{i} )} \right|^{\lambda } } } \right)} } \right]^{{{1 \mathord{\left/ {\vphantom {1 \lambda }} \right. \kern-0pt} \lambda }}} $$
    (1)

    where \( h_{A}^{\sigma (j)} (x_{i} ) \) and \( h_{B}^{\sigma (j)} (x_{i} ) \) are respectively the \( j \)th largest values in \( h_{A} (x_{i} ) \) and \( h_{B} (x_{i} ) \).

    When \( \lambda = 1,2 \), Eq. (1) can respectively reduce to the hesitant weighted Hamming distance and the hesitant weighted Euclidean distance.

  2. Hesitant fuzzy weighted Hamming distance:

    $$ d_{2} (A,B) = \sum\limits_{i = 1}^{n} {w_{i} \left[ {\frac{1}{{l_{{x_{i} }} }}\sum\limits_{j = 1}^{{l_{{x_{i} }} }} {\left| {h_{A}^{\sigma (j)} (x_{i} ) - h_{B}^{\sigma (j)} (x_{i} )} \right|} } \right]} $$
    (2)

    where \( h_{A}^{\sigma (j)} (x_{i} ) \) and \( h_{B}^{\sigma (j)} (x_{i} ) \) are respectively the \( j \)th largest values in \( h_{A} (x_{i} ) \) and \( h_{B} (x_{i} ) \).

  3. Hesitant fuzzy weighted Euclidean distance:

    $$ d_{3} (A,B) = \left[ {\sum\limits_{i = 1}^{n} {w_{i} \left( {\frac{1}{{l_{{x_{i} }} }}\sum\limits_{j = 1}^{{l_{{x_{i} }} }} {\left| {h_{A}^{\sigma (j)} (x_{i} ) - h_{B}^{\sigma (j)} (x_{i} )} \right|^{2} } } \right)} } \right]^{{{1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-0pt} 2}}} $$
    (3)

    where \( h_{A}^{\sigma (j)} (x_{i} ) \) and \( h_{B}^{\sigma (j)} (x_{i} ) \) are respectively the \( j \)th largest values in \( h_{A} (x_{i} ) \) and \( h_{B} (x_{i} ) \).

    Particularly, when \( w = \left( {\frac{1}{n},\frac{1}{n}, \ldots ,\frac{1}{n}} \right)^{T} \), Eqs. (2) and (3) can respectively reduce to the normalized hesitant fuzzy weighted Hamming distance and the normalized hesitant fuzzy weighted Euclidean distance:

  4. Normalized hesitant fuzzy weighted Hamming distance:

    $$ d_{4} (A,B) = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left[ {\frac{1}{{l_{{x_{i} }} }}\sum\limits_{j = 1}^{{l_{{x_{i} }} }} {\left| {h_{A}^{\sigma (j)} (x_{i} ) - h_{B}^{\sigma (j)} (x_{i} )} \right|} } \right]} $$
    (4)
  5. Normalized hesitant fuzzy weighted Euclidean distance:

    $$ d_{5} (A,B) = \left[ {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {\frac{1}{{l_{{x_{i} }} }}\sum\limits_{j = 1}^{{l_{{x_{i} }} }} {\left| {h_{A}^{\sigma (j)} (x_{i} ) - h_{B}^{\sigma (j)} (x_{i} )} \right|^{2} } } \right)} } \right]^{{{1 \mathord{\left/ {\vphantom {1 2}} \right. \kern-0pt} 2}}} $$
    (5)
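The family of distances in Eqs. (1)-(5) can be realized by one Python sketch of Eq. (1): setting \( \lambda = 1 \) or \( \lambda = 2 \) yields the weighted Hamming and Euclidean variants, and uniform weights give the normalized ones. We assume here that every pair of corresponding HFEs has already been extended to a common length; the function name is ours:

```python
def hesitant_weighted_distance(A, B, w, lam=1):
    """Generalized hesitant weighted distance of Eq. (1).
    A, B: HFSs given as lists of HFEs (equal-length lists of
    values in [0, 1]); w: attribute weights summing to 1;
    lam: the parameter lambda (1 = Hamming, 2 = Euclidean)."""
    total = 0.0
    for hA, hB, wi in zip(A, B, w):
        # sigma(j): compare the j-th largest values of each HFE
        a, b = sorted(hA, reverse=True), sorted(hB, reverse=True)
        total += wi * sum(abs(x - y) ** lam for x, y in zip(a, b)) / len(a)
    return total ** (1 / lam)
```

With \( w = (1/n, \ldots, 1/n)^T \) this reduces to the normalized distances of Eqs. (4) and (5).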

2.3 Previous work about psychological distance

Berkowitsch et al. [1] proposed a generalized psychological distance measure for the real number field. The new distance measure takes into account not only the decision maker's preferences among alternative options but also the subjective importance of all the attributes. Considering the indifference vectors and the dominance vector, it is necessary to give the dominance vector a higher weight than the indifference vectors because of its higher influence on decision making.

The first step of calculating the distance is to assign values to all the attributes of all the alternative options. In order to compare those values precisely, it is essential to standardize all the attribute values so that they have the same range. Then the directions and lengths of the indifference vectors and the dominance vector are calculated. Since the dominance vector is orthogonal to all the indifference vectors, we only need to calculate the directions of the indifference vectors. Indifference vectors encode a kind of information called "exchange ratios", which show how many units of one attribute must be gained to compensate for giving up one unit of another attribute. The directions of the indifference vectors depend on these exchange ratios. Berkowitsch et al. [1] pointed out that the number of indifference vectors, like the number of exchange ratios, depends on the number of attributes. There is only one dominance vector no matter how many attributes there are, and the dominance vector needs to be given a higher weight than the indifference vectors; that is, a parameter greater than 1 is assigned to the dominance direction.

Then, the distance between every two options should be expressed in terms of distances in the dominance direction and distances in the indifference directions. In the standardized attribute plane, the line between two options can be expressed as a distance vector. Next, the distance vector should be expressed in terms of the dominance vector and the indifference vectors. To achieve this, a change of basis is necessary.

The last step is to calculate the final distance. After computing the Euclidean distance of the transformed distance vector, the distance in the dominance direction is multiplied by a weight to boost its influence on the final result. The weight should be equal to or greater than 1, depending on the actual impact of the dominance direction. If the weight equals 1, the final distance reduces to the distance that does not take the impact of the dominance direction into consideration.

It is worth pointing out that Berkowitsch et al.'s [1] method is only suitable for real numbers. Inspired by the idea of Berkowitsch et al.'s [1] distance measure, in the next section we construct a new psychological distance under the hesitant fuzzy environment, extending the advantages of their method to hesitant fuzzy distance measures.

3 Psychological distance for hesitant fuzzy information

Past research on distance measures for hesitant fuzzy information has provided some good properties indicating how a distance measure between any two HFSs should be proposed, for example, the distance formulas for HFSs based on the traditional Hamming distance and Euclidean distance [27]. However, one thing that past research on distance measures for HFSs fails to consider is the subjective preferential relationship between the alternatives. Motivated by the idea of Berkowitsch et al.'s [1] method, in the following we construct a novel distance measure for hesitant fuzzy information to solve this problem.

3.1 The psychological distance for HFSs

Firstly, assume that there are \( n \) options, and \( B = \{ B_{1} ,B_{2} , \ldots B_{n} \} \) is the set of those options. \( A = \{ A_{1} ,A_{2} \ldots A_{m} \} \) is a set of \( m \) attributes used to comprehensively measure those options. To better estimate the actual values of the \( m \) attributes for each option, suppose that several experts are invited as decision makers. For any attribute of any option, the decision makers may have different judgments about what scores should be given, and thus every attribute of every option can obtain more than one value. To facilitate the calculation, it is necessary to scale all those scores to satisfy the standard of HFSs, which means that all the scores should be between 0 and 1. Once they satisfy this standard, every attribute of every option becomes a HFE. Then, since those attributes are not equally important, that is, some attributes have relatively higher significance than others, each attribute should be given a unique weight to show its importance. We define \( w_{i} (i = 1,2, \ldots ,m) \) as the corresponding weights of the \( m \) attributes, where \( 0 \le w_{i} \le 1 \) for all \( i = 1,2, \ldots ,m \), and \( \sum\nolimits_{i = 1}^{m} {w_{i} } = 1 \). Each of the \( n \) options is then a HFS, so in order to calculate the psychological distance between HFSs, we just need to calculate the psychological distance between those options.

Next, in order to obtain the exchange ratios, we compare each attribute with a reference attribute, normally the first one. Since the weights of the attributes are already known, we first calculate the indifference vectors; these \( m \)-dimensional indifference vectors show the directions along which options are competitive with each other:

$$ v_{j} = \left( { - \frac{{w_{j + 1} }}{{w_{1} }},0, \ldots ,0,\frac{{w_{1} }}{{w_{1} }},0, \ldots ,0} \right)^{T} = \left( { - \frac{{w_{j + 1} }}{{w_{1} }},0, \ldots ,0,1,0, \ldots ,0} \right)^{T} ,\;j = 1,2, \ldots ,m - 1 $$
(6)

where 1 is the \( (j + 1) \)th component.

Apparently, there are \( m - 1 \) indifference vectors, and the exchange ratio is given by the two non-zero entries. The exchange ratio in the \( j \)th indifference vector represents the number of units that the \( (j + 1) \)th attribute obtains in exchange for giving up one unit of the first attribute. Next, we propose the dominance vector, which shows the dominance relationship between the options along the dominance direction. It can be easily derived, since the dominance vector is orthogonal to all the \( m - 1 \) indifference vectors: \( d \cdot v_{j} = 0 \) for any \( j = 1,2, \ldots ,m - 1 \), and thus the dominance vector is:

$$ d = \left( {\frac{{w_{1} }}{{w_{1} }},\frac{{w_{2} }}{{w_{1} }}, \ldots ,\frac{{w_{m} }}{{w_{1} }}} \right)^{T} $$
(7)

Then, the \( m \times m \) matrix \( B^{*} \), which contains the \( m - 1 \) indifference vectors \( v_{1} ,v_{2} , \ldots ,v_{m - 1} \) and the dominance vector \( d \), can be constructed as \( B^{*} = [v_{1} ,v_{2} , \ldots ,v_{m - 1} ,d] \).

The next step is to standardize all the indifference vectors and the dominance vector so that the length of each vector is 1. Then we have:

$$ B = \left[ {\frac{{v_{1} }}{{\left\| {v_{1} } \right\|}},\frac{{v_{2} }}{{\left\| {v_{2} } \right\|}}, \ldots ,\frac{{v_{m - 1} }}{{\left\| {v_{m - 1} } \right\|}},\frac{d}{\left\| d \right\|}} \right] $$
(8)

where \( \left\| {v_{j} } \right\| \) (for any \( j = 1,2, \ldots ,m - 1 \)) and \( \left\| d \right\| \) are the Euclidean lengths of \( v_{j} \) and \( d \). Obviously, all the \( m - 1 \) indifference vectors and the dominance vector have been standardized in the matrix \( B \), which is composed of \( m \) unit vectors.
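As a concrete check of Eqs. (6)-(8), the following Python sketch builds the \( m - 1 \) indifference vectors and the dominance vector from a weight vector, normalizes them, and assembles them as the columns of \( B \) (the name `basis_matrix` is purely illustrative):

```python
import math

def basis_matrix(w):
    """Matrix B of Eq. (8): columns are the m-1 normalized
    indifference vectors of Eq. (6) followed by the normalized
    dominance vector of Eq. (7)."""
    m = len(w)
    cols = []
    for j in range(1, m):              # v_j for j = 1, ..., m-1
        v = [0.0] * m
        v[0] = -w[j] / w[0]            # exchange ratio -w_{j+1}/w_1
        v[j] = 1.0                     # 1 in the (j+1)th component
        n = math.sqrt(sum(x * x for x in v))
        cols.append([x / n for x in v])
    d = [wi / w[0] for wi in w]        # dominance vector, Eq. (7)
    n = math.sqrt(sum(x * x for x in d))
    cols.append([x / n for x in d])
    # m x m matrix whose c-th column is cols[c]
    return [[cols[c][r] for c in range(m)] for r in range(m)]
```

By construction, the dominance column is orthogonal to every indifference column, whatever the weight distribution.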

For the \( n \) options \( B_{i} (i = 1,2, \ldots ,n) \), each option is a HFS, and each attribute of each option is a HFE. Since all the HFSs have the same number of attributes in our discussion, based on Berkowitsch et al.'s method [1], we should calculate the distances between the corresponding HFEs of the options. To obtain a better result, here we define the concept of a distance vector for HFSs, which we denote by \( dist_{A} \).

When calculating the distance vector \( dist_{A} \) between two HFSs, the distances between every pair of corresponding HFEs should be calculated first. Considering the direction from one HFS to the other, each distance between HFEs can be positive or negative, as determined by the comparison method for HFEs [25]: if the direction is from one HFE to a greater one, the distance is positive; if the direction is from one HFE to a lesser one, the distance is negative. For example, consider two HFSs:

$$ S_{1} = \left\{ \begin{aligned} \left\langle {x_{1} ,(0.1,0.2,0.1)} \right\rangle \hfill \\ \left\langle {x_{2} ,(0.5,0.6,0.8)} \right\rangle \hfill \\ \left\langle {x_{3} ,(0.6,0.9,0.9)} \right\rangle \hfill \\ \end{aligned} \right\},\quad S_{2} = \left\{ \begin{aligned} \left\langle {x_{1} ,(0.1,0.4,0.2)} \right\rangle \hfill \\ \left\langle {x_{2} ,(0.3,0.6,0.5)} \right\rangle \hfill \\ \left\langle {x_{3} ,(0.6,0.7,0.8)} \right\rangle \hfill \\ \end{aligned} \right\} $$

based on the comparison method, there are magnitude relationships as follows:

$$ \begin{aligned} \left\langle {x_{1} ,(0.1,0.2,0.1)} \right\rangle & < \left\langle {x_{1} ,(0.1,0.4,0.2)} \right\rangle ,\left\langle {x_{2} ,(0.5,0.6,0.8)} \right\rangle > \left\langle {x_{2} ,(0.3,0.6,0.5)} \right\rangle \\ & \quad \left\langle {x_{3} ,(0.6,0.9,0.9)} \right\rangle > \left\langle {x_{3} ,(0.6,0.7,0.8)} \right\rangle \\ \end{aligned} $$

and let \( d_{i} (i = 1,2,3) \) denote the hesitant fuzzy Hamming distances of these three pairs of HFEs: \( d_{1} = 0.1,\;\;d_{2} = 0.167, \) and \( d_{3} = 0.1 \). Hence, we get \( dist_{A} (S_{1} S_{2} ) = \left( {0.1,\; - 0.167,\; - 0.1} \right)^{T} \).
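The distance vector of the example can be reproduced with a short Python sketch that signs each hesitant fuzzy Hamming distance by the score-based comparison (equal-length HFEs assumed; `dist_vector` is our name):

```python
def dist_vector(S1, S2):
    """Distance vector dist_A between two HFSs: the hesitant fuzzy
    Hamming distance of each pair of corresponding HFEs, positive
    when the HFE of S1 is inferior (by score) to that of S2 and
    negative when it is superior."""
    out = []
    for h1, h2 in zip(S1, S2):
        a, b = sorted(h1, reverse=True), sorted(h2, reverse=True)
        d = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
        s1, s2 = sum(h1) / len(h1), sum(h2) / len(h2)   # scores
        out.append(d if s1 < s2 else -d if s1 > s2 else 0.0)
    return out
```

Applied to \( S_1 \) and \( S_2 \) above, it returns \( (0.1, -0.167, -0.1) \) up to rounding.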

After we have the distance vector \( dist_{A} \) from one option to another, the next step is to transform it into a new distance vector \( dist_{B} \). To achieve this, a change of basis is necessary; here we use the basis \( B \):

$$ dist_{B} = B^{ - 1} \cdot dist_{A} $$
(9)

Like \( dist_{A} \), \( dist_{B} \) is a distance vector, but its components are expressed in terms of the indifference vectors and the dominance vector. Moreover, \( dist_{B} \) is the key vector for constructing the hesitant fuzzy psychological distance measure.
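The change of basis in Eq. (9) amounts to solving the linear system \( B \cdot dist_{B} = dist_{A} \); a small Gaussian elimination (our own sketch, avoiding an explicit matrix inverse) suffices:

```python
def change_of_basis(B, dist_A):
    """dist_B = B^{-1} . dist_A (Eq. (9)), obtained by solving
    B . dist_B = dist_A with Gauss-Jordan elimination and
    partial pivoting.  B is a list of rows."""
    m = len(dist_A)
    M = [row[:] + [dist_A[i]] for i, row in enumerate(B)]  # [B | dist_A]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]        # bring pivot row up
        for r in range(m):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][m] / M[i][i] for i in range(m)]
```

Solving the system is numerically preferable to inverting \( B \), although both give the same \( dist_{B} \).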

The next step is to present a preparatory distance measure, a key component of the final distance formula. The main work of this step is to endow the distance in the dominance direction with a relatively higher weight, which guarantees that the dominance direction has a higher impact than the indifference directions:

$$ D_{t} = \sqrt {dist_{B}^{\prime } \cdot W \cdot dist_{B} } $$
(10)

where \( D_{t} \) is the preparatory distance measure for options and \( W \) is an \( m \times m \) matrix as follows:

$$ W = \left[ {\begin{array}{*{20}c} 1 & {} & {} & {} \\ {} & 1 & {} & {} \\ {} & {} & \ddots & {} \\ {} & {} & {} & w \\ \end{array} } \right]_{m \times m} $$
(11)

where \( w \) satisfies \( w > 1 \); the greater the \( w \) we choose, the more weight is given to the distance in the dominance direction, whose weight is \( w \) times that of the distances in the indifference directions. By introducing the weight \( w \), the importance of the dominance direction can be reflected.
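Because \( W \) is diagonal, Eq. (10) does not require an explicit matrix product: it is the Euclidean norm of \( dist_{B} \) with the last (dominance) component weighted by \( w \). A sketch (`w_dom` is our name for the weight \( w \)):

```python
import math

def preparatory_distance(dist_B, w_dom):
    """D_t = sqrt(dist_B' . W . dist_B), Eq. (10): W is the identity
    except that its last diagonal entry, w_dom > 1, weights the
    dominance component of dist_B."""
    q = sum(x * x for x in dist_B[:-1]) + w_dom * dist_B[-1] ** 2
    return math.sqrt(q)
```

With `w_dom = 1` this reduces to the plain Euclidean length of the transformed distance vector, as noted in Sect. 2.3.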

Since the distances between HFSs need to be less than or equal to 1, a balancing coefficient \( c \) is introduced to ensure that the psychological distances meet this criterion. The final psychological distance formula is as follows:

$$ D_{f} = \frac{{D_{t} }}{c} $$
(12)

where \( D_{f} \) is the final psychological distance, and the balancing coefficient \( c \) is set to be the maximum value of \( D_{t} \).

Considering the maximum value of \( D_{t} \): for every attribute, it is obvious that the two HFEs \( \{ 0,0, \ldots , 0 \} \) and \( \{ 1,1, \ldots ,1\} \) (the numbers of 0s and 1s depend on the number of decision makers) have the longest distance, because in this condition the two HFEs are extreme: all the decision makers choose the minimum value for one option, while they all choose the maximum value for the other, which means that the first option is impossible while the other is certain. The same goes for HFSs. The following two HFSs:

$$ \begin{aligned} B_{x} & = \{ \left\langle {A_{1} ,(0,0, \ldots ,0)} \right\rangle ,\left\langle {A_{2} ,(0,0, \ldots ,0)} \right\rangle , \ldots ,\left\langle {A_{m} ,(0,0, \ldots ,0)} \right\rangle \} \\ B_{y} & = \{ \left\langle {A_{1} ,(1,1, \ldots ,1)} \right\rangle ,\left\langle {A_{2} ,(1,1, \ldots ,1)} \right\rangle , \ldots ,\left\langle {A_{m} ,(1,1, \ldots ,1)} \right\rangle \} \\ \end{aligned} $$

have the longest distance (both the traditional distance and the psychological distance), because they are two opposite extremes in every attribute. In the psychological distance, different attributes are assigned different weights to comprehensively estimate their influences, and the distance in the dominance direction is guaranteed to be weighted more heavily than the distances in the indifference directions. However, the two HFSs above have the biggest gap in every attribute, which means that their psychological distance must be the longest no matter how the weights are distributed. Hence, \( c = D_{t} (B_{x} ,B_{y} ) \). Clearly, the psychological distance for HFSs satisfies \( 0 \le D_{f} \le 1 \). For two HFSs \( M \) and \( N \), if \( M \equiv N \), then \( dist_{A} (MN) = 0 \) and \( dist_{B} (MN) = 0 \), and thus \( D_{f} (MN) = 0 \); conversely, if \( D_{f} (MN) = 0 \), then \( D_{t} (MN) = 0 \), so \( dist_{A} (MN) = 0 \) and \( M \equiv N \). We notice that \( dist_{A} (MN) = - dist_{A} (NM) \) and \( dist_{B} (MN) = - dist_{B} (NM) \), from which it is easy to prove the following result:

$$ dist_{B} (MN)^{\prime } \cdot W \cdot dist_{B} (MN) = dist_{B} (NM)^{\prime } \cdot W \cdot dist_{B} (NM) $$
(13)

hence, \( D_{t} (MN) = D_{t} (NM) \) and \( D_{f} (MN) = D_{f} (NM) \).

As shown above, the proposed distance measure can well satisfy the three properties for distance measures of HFSs illustrated earlier.

To better understand the difference between the proposed hesitant fuzzy psychological distance and the existing hesitant fuzzy Hamming distance, we shall give some geometric interpretation as follows.

3.2 Graphical comparison between the hesitant fuzzy psychological distance measure and traditional hesitant fuzzy distance measures

As shown in Fig. 1, a comparison is made to further show the superiority and effectiveness of the proposed distance measure. Figure 1a shows the case of one of the previous hesitant fuzzy distances, the hesitant fuzzy Hamming distance [27], which is representative of most traditional hesitant fuzzy distances, such as the normalized Hamming distance, the Euclidean distance and the normalized Euclidean distance, while Fig. 1b shows the case of the proposed hesitant fuzzy psychological distance space. There are three options, \( A \), \( B \) and \( C \), which represent three kinds of new products, and the two axes \( x \) and \( y \) represent two attributes describing the performances of the three options. The values of \( x \) and \( y \) are between 0 and 1, and the greater the values, the better the performances of the products. Experts are invited to estimate the performances of \( A \), \( B \) and \( C \) with respect to the attributes \( x \) and \( y \), and every attribute of each option has several estimation results proposed by several experts. Thus, \( A \), \( B \) and \( C \) are three HFSs. Apparently, the hesitant fuzzy Hamming distance (or the hesitant fuzzy Euclidean distance) between options \( A \) and \( B \) and that between options \( B \) and \( C \) in (a) are exactly the same. However, the psychological distance between options \( A \) and \( B \) and that between options \( B \) and \( C \) in (b) are different: the psychological distance between \( B \) and \( C \) is longer than that between \( A \) and \( B \), because \( A \) and \( B \) are both dominated by \( C \); that is, \( A \) and \( B \) have relatively similar levels of performance, while \( C \) is highly superior.
When comparing \( A \) and \( B \), the loss in \( x \) (\( y \)) can be compensated by the surplus in \( y \) (\( x \)); thus they are comparable, and a replacement of one by the other is acceptable. However, a replacement of \( C \) by \( A \) or by \( B \) is not acceptable to the experts, because it would lead to a loss in both attributes. Apparently, the experts may prefer \( C \) over \( A \) and \( B \). As seen above, in (b), the options \( A \) and \( B \) are highly comparable to each other while both are dominated by the option \( C \). So the line between \( A \) and \( B \) is the indifference line, and the line between \( B \) and \( C \) is the dominance line, along which the psychological space is stretched; by contrast, the distance between \( A \) and \( B \) in (b) is equal to the distance between \( A \) and \( B \) in (a).

Fig. 1

Three options in the hesitant fuzzy Hamming distance condition (a) and the hesitant fuzzy psychological distance condition (b)

Clearly, \( A \) and \( B \) have a lot in common, while \( C \) differs a lot from \( A \) and \( B \). Hence, it is reasonable that the psychological distance between \( A \) and \( B \) is shorter than the distance between either of them and \( C \).

In the following, several steps for the algorithm procedure of hesitant fuzzy psychological distance are summarized.

3.3 Hesitant fuzzy psychological distance procedure

  • Step 1 After the decision makers estimate all the attribute values of the options, the scores given by the experts should be scaled into the unit interval so as to satisfy the definition of an HFS. Besides, the weights of all the attributes should be determined reasonably.

  • Step 2 According to the formulas (6) and (7), we calculate the \( m - 1 \) indifference vectors and the dominance vector to get the matrix \( B^{*} = [v_{1} ,v_{2} , \ldots v_{m - 1} ,d] \).

  • Step 3 By normalizing all these vectors according to the formula (8), the matrix \( B \) can be calculated, in which every vector has unit length.

  • Step 4 Calculate \( dist_{A} \) between the options, and further calculate \( dist_{B} \) by a change of basis according to the formula (9).

  • Step 5 Calculate \( D_{t} \) by the formula (10), and further determine \( c \).

  • Step 6 Calculate the final psychological distance \( D_{f} \) by the formula (12).
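The six steps above can be sketched end to end. Since the formulas (6)–(12) are not reproduced in this excerpt, the sketch below fixes one plausible reading of them: indifference vectors are weight-neutral trade-offs between attributes, the dominance vector points along the attribute weights, per-attribute distances are signed Hamming distances, and \( D_{t} \) is a weighted quadratic form. All function and variable names are illustrative, not from the original.

```python
import numpy as np

def hfe_hamming(h1, h2):
    # Hamming distance between two equal-length HFEs: mean absolute difference.
    return float(np.mean(np.abs(np.asarray(h1) - np.asarray(h2))))

def psych_distance(X, Y, weights, w_dom=5.0):
    """Sketch of Steps 2-6 for two HFSs given as lists of equal-length HFEs."""
    w = np.asarray(weights, dtype=float)
    m = len(w)
    # Step 2: m - 1 indifference vectors (weight-neutral trade-offs, an assumed
    # reading of formula (6)) and a dominance vector along the weights (7).
    cols = []
    for k in range(1, m):
        iv = np.zeros(m)
        iv[0], iv[k] = -w[k] / w[0], 1.0      # satisfies w . iv = 0
        cols.append(iv)
    cols.append(w / w.max())
    B = np.column_stack(cols)
    # Step 3: normalize every basis vector to unit length (formula (8)).
    B /= np.linalg.norm(B, axis=0)
    # Step 4: signed per-attribute Hamming distances, then a change of basis (9).
    dist_A = np.array([np.sign(np.mean(y) - np.mean(x)) * hfe_hamming(x, y)
                       for x, y in zip(X, Y)])
    dist_B = np.linalg.solve(B, dist_A)
    # Step 5: weighted quadratic form with weight w_dom on the dominance
    # direction; the balancing coefficient c is the same form evaluated for
    # the all-zero vs all-one HFSs, so the final distance lies in [0, 1].
    W = np.diag([1.0] * (m - 1) + [w_dom])
    ref = np.linalg.solve(B, np.ones(m))
    c = float(ref @ W @ ref)
    # Step 6: final psychological distance (formula (12)).
    return float(dist_B @ W @ dist_B) / c
```

By construction this sketch satisfies the boundary cases \( D_{f} (X,X) = 0 \) and \( D_{f} \) between the all-zero and all-one HFSs equal to 1; the intermediate numerical values it produces depend on the assumed conventions and need not match the worked example below exactly.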

To briefly demonstrate the whole process of calculating the psychological distance, we give the following flow chart (Fig. 2):

Fig. 2

The process flow chart of calculating the psychological distance

4 Application and comparison

In this part, two examples will be illustrated in order to show the computing process and application of the proposed hesitant fuzzy psychological distance:

Computing process Suppose that an investigation report about the nutritive value of fruits needs to be completed. Three kinds of fruits \( A \), \( B \) and \( C \) are to be tested. Since an objective and reasonable investigation result is necessary, several experts are invited to estimate three main attributes: (1) the index of anti-aging; (2) the index of skin caring; and (3) the index of lowering blood pressure. Scores are between 0 and 1, where 0 represents the lowest score and 1 the highest. Since the expertise of the experts differs, each expert only estimates a part of the attributes.

We can see from Table 1 that \( A \), \( B \) and \( C \) are three HFSs, and each of them is made up of three HFEs. However, the numbers of values in these HFEs are different. To solve this problem when calculating distances, according to the regulations given by Xu and Xia [27], the shorter HFEs are extended so that all the HFEs of the same attribute have the same number of values. For each HFE, the extension is performed by repeating one of its existing values, and which one to choose depends on the risk preference of the experts: if the experts are pessimists, the minimal value is added, while if they are optimists, the maximal one is added. Here, assuming that the experts are all pessimists, we extend the shorter HFEs by adding their minimal values; the results are listed in Table 2.
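The extension rule can be sketched as a small helper; the function name is illustrative.

```python
def extend_hfes(hfes, pessimistic=True):
    """Pad shorter HFEs by repeating their minimal (pessimist) or maximal
    (optimist) value until all HFEs of one attribute have equal length,
    following the regulation of Xu and Xia [27]."""
    n = max(len(h) for h in hfes)
    pick = min if pessimistic else max
    return [list(h) + [pick(h)] * (n - len(h)) for h in hfes]
```

For example, `extend_hfes([[0.3, 0.5], [0.2, 0.4, 0.6]])` pads the first HFE with its minimum 0.3, while the optimistic variant pads with 0.5.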

Table 1 Evaluation results given by the experts
Table 2 Modified evaluation results

Considering the importance of each attribute, the experts decide to distribute their subjective importance weights as shown in Table 3.

Table 3 Distribution of subjective importance weights from the experts

From the above estimation results, it is clear that \( B \) and \( C \) are comprehensively superior to \( A \), while it is hard to choose between \( B \) and \( C \) because they are highly competitive: \( B \) has a high score in the index of anti-aging, while \( C \) has a high score in the index of lowering blood pressure. Thus, it is appropriate to assume that the experts are indifferent between \( B \) and \( C \), and that both \( B \) and \( C \) dominate the option \( A \).

The next step is to calculate the indifference vectors and the dominance vector from the known weights. The two 3-dimensional indifference vectors are \( iv_{1} = \left( { - 0.5,\;1,\;0} \right)^{T} \) and \( iv_{2} = \left( { - 1,0,1} \right)^{T} \), the dominance vector is \( dv = \left( {1,0.5,1} \right)^{T} \), and thus, we get the matrix \( B^{*} \):

$$ B^{*} = \left( {\begin{array}{*{20}c} { - 0.5} & { - 1} & 1 \\ 1 & 0 & {0.5} \\ 0 & 1 & 1 \\ \end{array} } \right) $$
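The vectors above are consistent with attribute weights proportional to \( (0.4,\;0.2,\;0.4) \) — an inference from \( dv \), not a value read from Table 3. Under that assumption, one way to reproduce \( B^{*} \) is:

```python
import numpy as np

w = np.array([0.4, 0.2, 0.4])              # assumed weights, inferred from dv
iv1 = np.array([-w[1] / w[0], 1.0, 0.0])   # weight-neutral trade-off: w . iv1 = 0
iv2 = np.array([-w[2] / w[0], 0.0, 1.0])   # weight-neutral trade-off: w . iv2 = 0
dv = w / w.max()                           # dominance direction along the weights
B_star = np.column_stack([iv1, iv2, dv])   # columns: iv1, iv2, dv
```

The two indifference vectors leave the weighted sum of attribute values unchanged, while the dominance vector improves every attribute in proportion to its weight.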

The next step is to calculate the matrix \( B \) by normalizing the indifference vectors and the dominance vector so that their lengths become 1. Since the lengths are \( l_{{iv_{1} }} = \sqrt {1.25} \), \( l_{{iv_{2} }} = \sqrt 2 \) and \( l_{dv} = \sqrt {2.25} = 1.5 \), we get

$$ B = \left( {\begin{array}{*{20}c} { - \frac{0.5}{{\sqrt {1.25} }}} & { - \frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ {\frac{1}{{\sqrt {1.25} }}} & 0 & {\frac{1}{3}} \\ 0 & {\frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ \end{array} } \right) $$
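The normalization amounts to dividing each column of \( B^{*} \) by its Euclidean length; a short check:

```python
import numpy as np

B_star = np.array([[-0.5, -1.0, 1.0],
                   [1.0, 0.0, 0.5],
                   [0.0, 1.0, 1.0]])
# Divide each column by its Euclidean norm so every basis vector has length 1.
B = B_star / np.linalg.norm(B_star, axis=0)
```

Every column of `B` then has unit length, and the entries agree with the display above (for instance, the last column becomes \( (2/3,\;1/3,\;2/3)^{T} \)).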

Next, we calculate the distance vectors between the three options based on the method mentioned before. Firstly, the magnitude relationships are as follows:

  1. The index of anti-aging: \( A(0.3,0.4,0.3,0.2) < C(0.5,0.4,0.3,0.3) < B(0.7,0.8,0.6,0.6) \);

  2. The index of skin caring: \( B(0.5,0.1,0.2,0.1) < C(0.2,0.3,0.2,0.3) < A(0.2,0.2,0.3,0.4) \);

  3. The index of lowering blood pressure: \( A(0.3,0.4,0.4,0.3) < B(0.5,0.3,0.5,0.4) < C(0.6,0.7,0.8,0.6) \).

and the hesitant fuzzy Hamming distances of those HFEs are as follows:

  1. The index of anti-aging: \( d_{AB} = 0.375 \), \( d_{AC} = 0.075 \), and \( d_{BC} = 0.3 \);

  2. The index of skin caring: \( d_{AB} = 0.2 \), \( d_{AC} = 0.075 \), and \( d_{BC} = 0.175 \);

  3. The index of lowering blood pressure: \( d_{AB} = 0.125 \), \( d_{AC} = 0.4 \), and \( d_{BC} = 0.25 \).
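These per-attribute Hamming distances can be reproduced as the mean absolute difference of the aligned membership values of two equal-length HFEs:

```python
import numpy as np

def hfe_hamming(h1, h2):
    """Hesitant fuzzy Hamming distance between two equal-length HFEs:
    the mean absolute difference of their aligned membership values."""
    return float(np.mean(np.abs(np.asarray(h1) - np.asarray(h2))))
```

For the index of anti-aging, `hfe_hamming([0.3, 0.4, 0.3, 0.2], [0.7, 0.8, 0.6, 0.6])` gives 0.375, matching \( d_{AB} \) above.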

Then, the distance vectors can be calculated through the results above:

$$ dist_{A} (AB) = \left( {0.375, - 0.2,0.125} \right)^{T} ,\quad dist_{A} (AC) = \left( {0.075, - 0.075,0.4} \right)^{T} ,\quad dist_{A} (BC) = \left( { - 0.3,0.175,0.25} \right)^{T} $$

To calculate the distance vector \( dist_{B} \), which is expressed in the basis formed by the indifference vectors and the dominance vector, we conduct a change of basis according to the formula (9):

$$ \begin{aligned} & dist_{B} (AB) = B^{ - 1} \cdot dist_{A} (AB) = \left( {\begin{array}{*{20}c} { - \frac{0.5}{{\sqrt {1.25} }}} & { - \frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ {\frac{1}{{\sqrt {1.25} }}} & 0 & {\frac{1}{3}} \\ 0 & {\frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ \end{array} } \right)^{ - 1} \cdot \left( \begin{aligned} 0.375 \\ - 0.2 \\ 0.125 \\ \end{aligned} \right) = \left( \begin{aligned} 0.4152 \\ - 0.308 \\ 0.5142 \\ \end{aligned} \right) \\ & dist_{B} (AC) = B^{ - 1} \cdot dist_{A} (AC) = \left( {\begin{array}{*{20}c} { - \frac{0.5}{{\sqrt {1.25} }}} & { - \frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ {\frac{1}{{\sqrt {1.25} }}} & 0 & {\frac{1}{3}} \\ 0 & {\frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ \end{array} } \right)^{ - 1} \cdot \left( \begin{aligned} 0.075 \\ - 0.075 \\ 0.4 \\ \end{aligned} \right) = \left( \begin{aligned} 0.2475 \hfill \\ 0.1516 \hfill \\ 0.4392 \hfill \\ \end{aligned} \right) \\ & dist_{B} (BC) = B^{ - 1} \cdot dist_{A} (BC) = \left( {\begin{array}{*{20}c} { - \frac{0.5}{{\sqrt {1.25} }}} & { - \frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ {\frac{1}{{\sqrt {1.25} }}} & 0 & {\frac{1}{3}} \\ 0 & {\frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ \end{array} } \right)^{ - 1} \cdot \left( \begin{aligned} - 0.3 \\ 0.175 \\ 0.25 \\ \end{aligned} \right) = \left( \begin{aligned} - 0.2396 \\ 0.4646 \\ - 0.1178 \\ \end{aligned} \right) \\ \end{aligned} $$

The first distance vector \( dist_{B} (AB) \) shows that, to move from the option \( A \) to the option \( B \), we need to walk 0.4152 units along the first indifference vector, 0.308 units along the opposite direction of the second indifference vector, and 0.5142 units along the dominance direction. The results for \( dist_{B} (AC) \) and \( dist_{B} (BC) \) can be interpreted analogously.
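A change of basis of this kind can be computed without forming \( B^{-1} \) explicitly, and any computed \( dist_{B} \) can be sanity-checked by mapping it back: \( B \cdot dist_{B} \) must recover \( dist_{A} \). A sketch (the numerical values obtained depend on the exact conventions behind the formulas (6)–(9)):

```python
import numpy as np

B_star = np.array([[-0.5, -1.0, 1.0],
                   [1.0, 0.0, 0.5],
                   [0.0, 1.0, 1.0]])
B = B_star / np.linalg.norm(B_star, axis=0)   # unit-length basis vectors
dist_A_AB = np.array([0.375, -0.2, 0.125])    # dist_A(AB) from above
# Solve B x = dist_A instead of computing the inverse matrix explicitly.
dist_B_AB = np.linalg.solve(B, dist_A_AB)
```

The round trip `B @ dist_B_AB` reproduces `dist_A_AB`, which is the defining property of the change of basis.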

Considering the importance of the dominance direction, \( w \) is set to 5, which means that the dominance vector is five times as important as any of the indifference vectors. Then, the matrix \( W \) is:

$$ W = \left( {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \\ \end{array} } \right) $$

Based on the formula (10), the preparatory distance measure \( D_{t} \) between \( AB \), \( AC \) and \( BC \) can be calculated:

$$ \begin{aligned} D_{t} \left( {AB} \right) & = dist_{B} (AB)^{\prime } \cdot W \cdot dist_{B} (AB) \\ & = \left( {\begin{array}{*{20}c} {0.4152} & { - 0.308} & {0.5142} \\ \end{array} } \right) \cdot \left( {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \\ \end{array} } \right) \cdot \left( \begin{aligned} 0.4152 \\ - 0.308 \\ 0.5142 \\ \end{aligned} \right) \\ & = 1.5893 \\ D_{t} \left( {AC} \right) & = dist_{B} (AC)^{\prime } \cdot W \cdot dist_{B} (AC) \\ & = \left( {\begin{array}{*{20}c} {0.2475} & {0.1516} & {0.4392} \\ \end{array} } \right) \cdot \left( {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \\ \end{array} } \right) \cdot \left( \begin{aligned} 0.2475 \\ 0.1516 \\ 0.4392 \\ \end{aligned} \right) \\ & = 1.0487 \\ D_{t} (BC) & = dist_{B} (BC)^{\prime } \cdot W \cdot dist_{B} (BC) \\ & = \left( {\begin{array}{*{20}c} { - 0.2396} & {0.4646} & { - 0.1178} \\ \end{array} } \right) \cdot \left( {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \\ \end{array} } \right) \cdot \left( \begin{aligned} - 0.2396 \\ 0.4646 \\ - 0.1178 \\ \end{aligned} \right) \\ & = 0.3426 \\ \end{aligned} $$
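The reported values of \( D_{t} \) coincide with the weighted quadratic form \( dist_{B}^{\prime } \cdot W \cdot dist_{B} \) of the printed vectors, which can be checked directly:

```python
import numpy as np

# Unit weight for the two indifference directions, weight 5 for the
# dominance direction, as in the matrix W above.
W = np.diag([1.0, 1.0, 5.0])

def quad_form(v, W):
    """Weighted quadratic form v' . W . v."""
    v = np.asarray(v)
    return float(v @ W @ v)
```

For instance, `quad_form([0.4152, -0.308, 0.5142], W)` evaluates to about 1.5893.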

To get the final psychological distance \( D_{f} \) between the options, the balancing coefficient \( c \) needs to be calculated first. To this end, two HFSs \( E \) and \( F \) are constructed as follows:

$$ E = \left\{ \begin{aligned} (0,0,0,0) \hfill \\ (0,0,0,0) \hfill \\ (0,0,0,0) \hfill \\ \end{aligned} \right\},\quad F = \left\{ \begin{aligned} \hfill (1,1,1,1) \\ \hfill (1,1,1,1) \\ \hfill (1,1,1,1) \\ \end{aligned} \right\} $$

According to Sect. 3.2, we get \( c = D_{t} (E,F) \). To calculate \( c \), we can easily obtain:

$$ dist_{A} (EF) = \left( {1,1,1} \right)^{T} $$

and thus, based on the formula (9), we have

$$ dist_{B} (EF) = B^{ - 1} \cdot dist_{A} (EF) = \left( {\begin{array}{*{20}c} { - \frac{0.5}{{\sqrt {1.25} }}} & { - \frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ {\frac{1}{{\sqrt {1.25} }}} & 0 & {\frac{1}{3}} \\ 0 & {\frac{1}{\sqrt 2 }} & {\frac{2}{3}} \\ \end{array} } \right)^{ - 1} \cdot \left( \begin{aligned} 1 \\ 1 \\ 1 \\ \end{aligned} \right) = \left( \begin{aligned} - 0.6388 \\ 0.202 \\ 1.2857 \\ \end{aligned} \right) $$

Then, \( D_{t} (EF) \) can be further calculated:

$$ \begin{aligned} c & = D_{t} \left( {EF} \right) = dist_{B} (EF)^{\prime } \cdot W \cdot dist_{B} (EF) \\ & = \left( {\begin{array}{*{20}c} { - 0.6388} & {0.202} & {1.2857} \\ \end{array} } \right) \cdot \left( {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \\ \end{array} } \right) \cdot \left( \begin{aligned} - 0.6388 \\ 0.202 \\ 1.2857 \\ \end{aligned} \right) \\ & = 8.714 \\ \end{aligned} $$

On account of the results above, and according to the formula (12), the final psychological distances between \( AB \), \( AC \) and \( BC \) can be easily calculated:

$$ D_{f} \left( {AB} \right) = \frac{{D_{t} (AB)}}{c} = 0.1824,\quad D_{f} \left( {AC} \right) = \frac{{D_{t} (AC)}}{c} = 0.1203,\quad D_{f} \left( {BC} \right) = \frac{{D_{t} (BC)}}{c} = 0.0393 $$
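Dividing each \( D_{t} \) by \( c = 8.714 \) reproduces the final distances within rounding:

```python
c = 8.714
D_t = {"AB": 1.5893, "AC": 1.0487, "BC": 0.3426}
# Final psychological distances, normalized by the balancing coefficient c.
D_f = {pair: value / c for pair, value in D_t.items()}
```

Since \( c \) is the preparatory distance between the extreme HFSs, this normalization keeps \( D_{f} \) within \( [0, 1] \).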

For comparison, the hesitant fuzzy weighted Hamming distances between \( AB \), \( AC \) and \( BC \) are 0.24, 0.205 and 0.255, respectively. From the discrepancy between the results of the two distance measures, we can see that, according to the hesitant fuzzy psychological distance, \( B \) and \( C \) are far closer to each other than either of them is to \( A \), while the hesitant fuzzy weighted Hamming distances between the three options are relatively close. This is because, during the calculation of the hesitant fuzzy psychological distance, we endow the dominance vector with a higher weight, so that the differences between options along the dominance direction are expanded. The result confirms our assumption that \( B \) and \( C \) are relatively indifferent to each other, while \( A \) is dominated by \( B \) and \( C \). Since \( B \) and \( C \) are highly competitive and both superior to \( A \), it is reasonable that options with similar comprehensive performances, such as \( B \) and \( C \), end up relatively closer.

In the following, an application of the novel distance measure to the clustering field is conducted to further show its practical significance:

Application Following the first example, in order to make better use of all the nutrition and provide persuasive suggestions to people, five kinds of fruits need to be clustered based on their levels of comprehensive nutritive value. We again invite the experts to estimate the performances of the five kinds of fruits according to the previous three attributes. The estimation results are listed in Table 4.

Table 4 Evaluation results

Here we apply the new distance measure in the hesitant fuzzy minimal spanning tree (HFMST) clustering algorithm [30] to examine its practical performance.

Firstly, we directly calculate the hesitant fuzzy psychological distances between the five kinds of fruits; the results are listed in Table 5.

By combining Tables 4 and 5, we can easily see the relationships between the five options. According to the hesitant fuzzy minimal spanning tree clustering algorithm [30], the clustering result depending on the threshold \( \lambda \) is shown in Table 6:

Table 5 Psychological distances of the 5 options
Table 6 Clustering results of 5 options

We can see from Table 6 that the five kinds of fruits have been clustered by the hesitant fuzzy minimal spanning tree clustering algorithm based on the hesitant fuzzy psychological distance measure. Assuming \( \lambda = 0.1 \), it is apparent that \( A \) and \( D \) are highly comparable, that \( E \), \( B \) and \( C \) are highly comparable, and that \( E \), \( B \) and \( C \) are superior to \( A \) and \( D \). Since options that are highly comparable are more inclined to be clustered together, the application of the proposed distance measure in clustering is reasonable.
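The HFMST clustering step itself can be sketched generically: build a minimal spanning tree over the pairwise distances, cut every edge longer than the threshold \( \lambda \), and read off the connected components as clusters. The distance matrix below is purely hypothetical (Table 5 is not reproduced here); it only mimics the structure just described, with \( \{A, D\} \) and \( \{B, C, E\} \) mutually close.

```python
def mst_threshold_clusters(D, lam):
    """Minimal-spanning-tree clustering: Prim's algorithm on the symmetric
    distance matrix D, then cut edges longer than lam and return the
    connected components as sorted index clusters."""
    n = len(D)
    edges = []
    best = {j: (D[0][j], 0) for j in range(1, n)}  # cheapest link into the tree
    while best:
        j = min(best, key=lambda k: best[k][0])
        d, i = best.pop(j)
        edges.append((d, i, j))
        for k in best:
            if D[j][k] < best[k][0]:
                best[k] = (D[j][k], j)
    parent = list(range(n))                        # union-find over kept edges
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for d, i, j in edges:
        if d <= lam:                               # keep only short MST edges
            parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

# Hypothetical distances for options A, B, C, D, E (indices 0-4).
D = [[0.00, 0.20, 0.21, 0.04, 0.22],
     [0.20, 0.00, 0.03, 0.19, 0.05],
     [0.21, 0.03, 0.00, 0.20, 0.04],
     [0.04, 0.19, 0.20, 0.00, 0.23],
     [0.22, 0.05, 0.04, 0.23, 0.00]]
```

With `lam = 0.1` this toy matrix yields the clusters `[[0, 3], [1, 2, 4]]`, i.e. \( \{A, D\} \) and \( \{B, C, E\} \); raising the threshold merges everything into one cluster.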

Next, we cluster the five alternative options using the hesitant fuzzy Hamming distance [27].

Comparison analysis By comparing the two clustering results under different distance measures, we can better analyze the properties of the proposed one. The hesitant fuzzy Hamming distances between the five kinds of fruits are shown in Table 7.

Table 7 Hesitant fuzzy Hamming distances of the 5 options

Based on the hesitant fuzzy minimal spanning tree clustering algorithm [30], we obtain the clustering result depending on the threshold \( \mu \) in Table 8:

Table 8 Clustering results in hesitant fuzzy Hamming distance

Comparing the two clustering results above, we can see that they have some similarities, especially at certain thresholds. However, differences still exist. In the first clustering result, the option \( C \) is inclined to be clustered with \( B \) (see Table 6), while \( C \) tends to be clustered with \( A \) and \( D \) in the second clustering result (see Table 8). From Table 6, we can see that \( E \) is the best of the five alternative options, while \( E \) is harder to cluster with the other options in the second result than in the first one. The reason for these distinctions is that the proposed hesitant fuzzy psychological distance measure takes into account not only the practical distances but also the comprehensive performances estimated by the experts. For example, options that have the same level of performance are likely to be clustered together, even though their values in certain attributes may differ a lot. A clustering method that uses the proposed hesitant fuzzy psychological distance measure will therefore prefer clustering options that perform similarly, which is applicable in some circumstances. For instance, in the above example, with the novel hesitant fuzzy psychological distance measure, the options that have the same level of nutritive value tend to be clustered together, which coincides with the practical circumstances.

5 Concluding remarks

In this paper, we have developed a psychological distance measure for HFSs. The proposed distance measure is based on the properties of HFSs and inspired by the work of Berkowitsch et al. [1] in the real number field. The preferential relationships between HFSs have been considered by introducing two different directions, the indifference direction and the dominance direction; the vectors in the dominance direction have been given greater weights than the vectors in the indifference direction on account of their different significances. Besides, the importance degrees of different attributes have also been considered. Moreover, the concept of the distance vector between HFSs has been constructed, and a balancing coefficient has been proposed to ensure that the final hesitant fuzzy psychological distance measure satisfies the conditions of distance measures of HFSs. Compared with the traditional distance measures of HFSs in the multi-attribute space, the novel distance measure comprehensively takes people's subjective preferences among options and individual variations of attributes into account, and thus can reflect the comprehensive performances of the alternative options in many fields, such as clustering. Two detailed examples have illustrated the computational processes and the application potential of the hesitant fuzzy psychological distance measure.

Despite these advantages, compared with the traditional distance measures for HFSs, our new distance measure still has some aspects that need to be improved in the future. Its calculation has a high computational complexity, which may lead to an unsatisfying computation time when facing a large amount of data, and further research on its applications still needs to be conducted. In future work, we will therefore focus on reducing the computational complexity and on applying the new distance measure to several other fields, such as data mining and information retrieval.