
1 Introduction

The Sugeno integral is used in multicriteria decision making as a tool for guiding decision support [7, 12]. Sugeno integrals are qualitative aggregation operators that take as input some local evaluation of alternatives and output a global evaluation. In this paper we provide an elicitation protocol for the Sugeno integral.

The problem of eliciting Sugeno integrals that agree with a dataset has received some attention [6, 10,11,12], from both theoretical and practical points of view. The theoretical results concern the elicitation of a unique family of Sugeno integrals expected to be consistent with a set of data. Inconsistency is usually to be avoided; in [8] the data are partitioned into classes and a fuzzy integral is calculated for each one. In [10, 11] the aim is to identify the bounds of the family of Sugeno integrals consistent with the data. If the dataset is not fully consistent with a single family of Sugeno integrals, several families are considered; this point of view is motivated by the fact that the dataset may contain many classes of profiles.

Sugeno integrals are defined with respect to capacities, which are used in multicriteria decision making to represent the weights of subsets of criteria. Sugeno integrals are used in [10] to extract knowledge from experimental data in multi-factorial evaluation; a capacity associated with a Sugeno integral consistent with the dataset is calculated and then a set of rules corresponding to the capacity is derived. Other work on eliciting or learning a capacity from examples is based on linear programming methods that minimize the total error [1].

The approach presented in this paper is different. We are proposing incremental elicitation and winner determination processes in which preference queries are selected one at a time. The elicitation is targeted towards the determination of the best choice among a set of alternatives.

The aim of this paper is to introduce an adaptive elicitation procedure in the context of the Sugeno integral for the fast determination of a necessary winner and to evaluate the practical efficiency of this procedure.

The elicitation continues until we have enough information about the capacity to identify the alternative associated with the highest Sugeno value; to do this we adopt the maximin criterion, as proposed in previous work on multiattribute decision making [15]. The proposed method bears strong similarity to the regret-based approach for eliciting a capacity for the Choquet integral [2]; the main difference lies in the fully ordinal setting considered in this paper, which makes regret a meaningless concept in our context.

The paper is organized as follows. In Sect. 2 we provide some background on the Sugeno integral and its use in multiple criteria decision making. We then describe our elicitation method in Sect. 3. In Sect. 4 we provide numerical tests to evaluate our approach; we conclude with final remarks in Sect. 5.

2 Background and Notation

Let X be a finite set of alternatives or objects that need to be compared in order to make a decision. An object \(x \in X\) is evaluated with respect to a set of n criteria \(\mathcal {C}=\{1, \cdots ,n \}\). An object is represented by a vector \((x_1, \cdots , x_n)\) where \(x_i\) represents the evaluation of x according to the criterion i. The criteria are evaluated on a common (finite) evaluation scale L. The global evaluation is also given on L. We assume that L is a bounded totally ordered finite set with a bottom denoted by \(0_L\) and a top denoted by \(1_L\). Moreover, L is equipped with an involutive negation denoted by \(1_L-\), which is an order-reversing function.

2.1 Lattice of Capacities

A capacity (or fuzzy measure) v is a set function, defined over subsets of \(\mathcal {C}\), that is monotone with respect to set inclusion, i.e., \(v : 2^{\mathcal {C}} \rightarrow L\) such that if \(A \subseteq B \subseteq \mathcal {C}\) then \(v(A) \le v(B)\), with \(v(\emptyset )=0_L\) and \(v(\mathcal {C})=1_L\).

We denote by \(V_{\mathcal {C}}\) the set of all capacities on \(\mathcal {C}\), and we drop the subscript \(\mathcal {C}\) when it is clear from the context. A partial order \(\le \) between capacities is established as follows:

\(v^1 \le v^2\) whenever \(v^1(G) \le v^2(G)\) for all \(G \in 2^{\mathcal {C}}\).

The pair \((V_{\mathcal {C}},\le )\) is a bounded lattice. It can also be identified by the tuple \((V_{\mathcal {C}},\wedge ,\vee ,\bot ,\top )\) where the binary operators \(\vee \) (join) and \(\wedge \) (meet), and the elements \(\bot \) and \(\top \) are established as follows:

  • given \(v^1, v^2 \in V_{\mathcal {C}}\), the capacity \(v^1 \wedge v^2\) is such that \((v^1 \wedge v^2)(G) = \min (v^1(G),v^2(G))\) for all \(G \in 2^{\mathcal {C}}\);

  • given \(v^1,v^2 \in V_{\mathcal {C}}\), the capacity \(v^1 \vee v^2\) is such that \((v^1 \vee v^2)(G) = \max (v^1(G),v^2(G))\) for all \(G \in 2^{\mathcal {C}}\);

  • \(\bot \) gives 0 to all proper subsets of \(\mathcal {C}\), \(\bot (G)=0\) for all \(G \subset \mathcal {C}\);

  • the \(\top \) element is the capacity that associates 1 to every non-empty subset, \(\top (G)=1\) for all \(G \subseteq \mathcal {C}\) with \(G \ne \emptyset \).
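The lattice operations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a capacity is stored as a dictionary from frozensets of criteria to numeric levels, with 0 and 1 standing for \(0_L\) and \(1_L\), and all function names (`subsets`, `is_capacity`, `leq`, `meet`, `join`, `bottom`, `top`) are ours, not part of the paper.

```python
from itertools import combinations

def subsets(criteria):
    """Return every subset of `criteria` as a frozenset."""
    items = sorted(criteria)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_capacity(v, criteria):
    """Check v(empty) = 0, v(C) = 1 and monotonicity w.r.t. inclusion."""
    sets = subsets(criteria)
    if v[frozenset()] != 0 or v[frozenset(criteria)] != 1:
        return False
    return all(v[a] <= v[b] for a in sets for b in sets if a <= b)

def leq(v1, v2):
    """Pointwise order between two capacities."""
    return all(v1[g] <= v2[g] for g in v1)

def meet(v1, v2):
    """Pointwise minimum."""
    return {g: min(v1[g], v2[g]) for g in v1}

def join(v1, v2):
    """Pointwise maximum."""
    return {g: max(v1[g], v2[g]) for g in v1}

def bottom(criteria):
    """0 on every proper subset, 1 on the full set."""
    full = frozenset(criteria)
    return {g: 1 if g == full else 0 for g in subsets(criteria)}

def top(criteria):
    """1 on every non-empty subset, 0 on the empty set."""
    return {g: 1 if g else 0 for g in subsets(criteria)}

# Two hypothetical capacities on two criteria.
C = {1, 2}
v1 = {frozenset(): 0, frozenset({1}): 0.3, frozenset({2}): 0.6, frozenset({1, 2}): 1}
v2 = {frozenset(): 0, frozenset({1}): 0.5, frozenset({2}): 0.2, frozenset({1, 2}): 1}
print(meet(v1, v2)[frozenset({1})], join(v1, v2)[frozenset({1})])
```

Storing the full table of \(2^n\) values is only practical for a small number of criteria, which is the setting assumed in this sketch.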

Considering two capacities \(\check{v}\), \(\hat{v}\), an interval \([\check{v},\hat{v}]_{V_\mathcal {C}}\) is the subset \(\{ v \in V_\mathcal {C} | \check{v} \le v \le \hat{v} \}\); the interval is nonempty if and only if \(\check{v} \le \hat{v}\). Note that a nonempty interval is a sublattice [5] i.e., the interval is closed with the infimum \(\wedge \) and the supremum \(\vee \).

The intersection of two sublattices is a sublattice; in particular, the intersection of two intervals is again an interval, and hence a sublattice. The intersection of n intervals \([\check{v}^i,\hat{v}^i]_V\), with \(i=1,\ldots ,n\), is given by

$$\bigcap _{i=1,\ldots ,n} [\check{v}^i,\hat{v}^i]_V = \Big [\bigvee \check{v}^i, \bigwedge \hat{v}^i \Big ]_V.$$

It follows that the intersection of n intervals is not empty if and only if \(\bigvee \check{v}^i \le \bigwedge \hat{v}^i\), i.e., for all \(i,j \in \{1, \cdots , n\}\), we have \(\check{v}^i \le \hat{v}^j\).
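As a small illustration of the intersection formula, the following Python sketch computes the intersection of several intervals of capacities and compares the result with the pairwise condition \(\check{v}^i \le \hat{v}^j\). The dictionary representation, the helper `cap` (a capacity on two criteria) and the numeric values are assumptions made for the example only.

```python
from functools import reduce

def leq(v1, v2):
    return all(v1[g] <= v2[g] for g in v1)

def join(v1, v2):
    return {g: max(v1[g], v2[g]) for g in v1}

def meet(v1, v2):
    return {g: min(v1[g], v2[g]) for g in v1}

def intersect(intervals):
    """Intersection of intervals [lo, hi]; returns (lo, hi) or None if empty."""
    lower = reduce(join, (lo for lo, _ in intervals))
    upper = reduce(meet, (hi for _, hi in intervals))
    return (lower, upper) if leq(lower, upper) else None

def pairwise_ok(intervals):
    """Every lower bound is below every upper bound."""
    return all(leq(lo, hi) for lo, _ in intervals for _, hi in intervals)

def cap(a, b):
    """Hypothetical helper: capacity on criteria {1, 2} with v({1}) = a, v({2}) = b."""
    return {frozenset(): 0, frozenset({1}): a, frozenset({2}): b,
            frozenset({1, 2}): 1}

I1 = (cap(0.2, 0), cap(0.8, 0.5))
I2 = (cap(0, 0.3), cap(0.6, 1))
I3 = (cap(0.7, 0), cap(1, 1))

print(intersect([I1, I2]))      # non-empty: bounds cap(0.2, 0.3) and cap(0.6, 0.5)
print(intersect([I1, I2, I3]))  # empty, since 0.7 > 0.6 on the subset {1}
```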

We provide a direct proof, using lattice theory, of a statement mentioned in the proof of Proposition 6 in the paper by Prade et al. [10].

Proposition 1

Assume n intervals \([\check{v}^i,\hat{v}^i]_V\), with \(i=1,\ldots ,n\) whose pairwise intersections \([\check{v}^i,\hat{v}^i]_V \cap [\check{v}^j,\hat{v}^j]_V\) for all \(i,j \in \{1,\ldots ,n \}\), are not empty. Then the intersection \(\bigcap _{i=1,\ldots ,n} [\check{v}^i,\hat{v}^i]_V \) is not empty.

Proof

The fact that the pairwise intersections are not empty means that

$$[\check{v}^i,\hat{v}^i]_V \cap [\check{v}^j,\hat{v}^j]_V \ne \emptyset \;\; \forall i,j \iff \check{v}^i \vee \check{v}^j \le \hat{v}^i \wedge \hat{v}^j \;\; \forall i,j $$

Since, by definition of \(\vee \) and \(\wedge \), it holds that \(\check{v}^i \le \check{v}^i \vee \check{v}^j\) and \(\hat{v}^i \wedge \hat{v}^j \le \hat{v}^j\), it also follows that \( \check{v}^i \le \hat{v}^j \;\; \forall i,j\). Hence every \(\hat{v}^j\) is an upper bound of all the \(\check{v}^i\), so \(\bigvee _i \check{v}^i \le \hat{v}^j\) for each j, and therefore \(\bigvee _i \check{v}^i \le \bigwedge _i \hat{v}^i\), which means exactly that \(\bigcap _{i=1,\ldots ,n} [\check{v}^i,\hat{v}^i]_V \) is not empty.

2.2 Discrete Sugeno Integral

We now review the definition of the Sugeno integral, as used in Multi Criteria Decision Analysis (MCDA) to aggregate into a single score the evaluation of an object with respect to several criteria.

Let \(\sigma \) be a permutation on \(\mathcal {C}\) such that \(x_{\sigma (1)} \le \ldots \le x_{\sigma (n)}\). The Sugeno integral [13] of an alternative x with respect to capacity v can be defined by means of several equivalent expressions:

$$\begin{aligned} S_v (x )= \max _{A\subseteq \mathcal {C}} \min (v(A), \min _{i\in A} x_i) = \min _{A\subseteq \mathcal {C}} \max (v(\overline{A}), \max _{i\in A} x_i), \end{aligned}$$
(1)

where \(\overline{A}\) is the complement of A. These expressions can be simplified as follows:

$$\begin{aligned} S_v (x) = \max _{\alpha \in L} \min ( v(\{i:x_i \ge \alpha \}), \alpha )= \min _{\alpha \in L} \max (v(\{i:x_i > \alpha \}), \alpha ). \end{aligned}$$
(2)

A basic property of the Sugeno integral is that the result of the aggregation is between the minimum and the maximum component.

$$ \min _{i=1,\ldots ,n} x_i \le S_v(x_1,\ldots ,x_n) \le \max _{i=1,\ldots ,n} x_i $$

A direct consequence is that \(S_v(c,\ldots ,c)=c\) for any capacity v (idempotency or unanimity). Note also that the value of the Sugeno integral of an alternative x is monotone with respect to the order between capacities:

$$\begin{aligned} \text {If } v^1 \le v^2 \text { then } S_{v^1}(x) \le S_{v^2}(x) \;\;\; \forall x \in X \end{aligned}$$
(3)
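The two expressions of Eq. 1 can be checked numerically with a short Python sketch. The representation is an assumption of ours: a capacity is a dictionary over frozensets, an alternative is a dictionary from criteria to levels, and `capacity3` is a hypothetical helper for three criteria. The capacity and the four profiles used here are illustrative values.

```python
def capacity3(v1, v2, v3, v12, v13, v23):
    """Hypothetical helper: capacity on criteria {1, 2, 3} from its six free values."""
    fs = frozenset
    return {fs(): 0, fs({1}): v1, fs({2}): v2, fs({3}): v3,
            fs({1, 2}): v12, fs({1, 3}): v13, fs({2, 3}): v23,
            fs({1, 2, 3}): 1}

def sugeno_maxmin(v, x):
    """max over A of min(v(A), min_{i in A} x_i).

    The empty set is skipped: its term equals v(empty) = 0 and never wins.
    """
    return max(min(v[a], min(x[i] for i in a)) for a in v if a)

def sugeno_minmax(v, x):
    """min over A of max(v(complement of A), max_{i in A} x_i).

    The empty set is skipped: its term equals v(C) = 1 and never wins.
    """
    full = frozenset(x)
    return min(max(v[full - a], max(x[i] for i in a)) for a in v if a)

v = capacity3(0.5, 0, 0, 0.5, 0.5, 0.4)
X = {
    "a": {1: 0.2, 2: 0.4, 3: 0.5},
    "b": {1: 0.7, 2: 0.2, 3: 0.4},
    "c": {1: 0.1, 2: 1, 3: 0.7},
    "d": {1: 0, 2: 0.5, 3: 0},
}
for name, x in X.items():
    print(name, sugeno_maxmin(v, x), sugeno_minmax(v, x))
```

Both functions enumerate all subsets, so the cost grows as \(2^n\); the threshold forms of Eq. 2 would need only one capacity lookup per level of L.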

2.3 The Set of Capacities Consistent with Preference Data

We now summarize the results presented in [12] about the identification of the family of Sugeno integrals consistent with a dataset of statements comparing alternatives to a global evaluation level. More precisely, we consider preference statements of the type “the global evaluation of x is greater than or equal to a level \(\alpha \)” or “the global evaluation of y is lower than or equal to a level \(\lambda \)”, and we want to derive the set of capacities consistent with such statements.

For a pair \((x, \alpha ) \in X \times L\), we define the capacities \(\check{v}_{x,\alpha } \) and \(\hat{v}_{x,\alpha }\) as follows.

Definition 1

Given \(x \in X\) and \(\alpha \in L\), the capacities \(\check{v}_{x,\alpha } \) and \(\hat{v}_{x,\alpha }\) are defined as:

$$\check{v}_{x,\alpha }(A)= \left\{ \begin{array}{ll} 1_L & \text{if } A = \mathcal {C}\\ \alpha & \text{if } \{i \in \mathcal {C} \mid x_i\ge \alpha \} \subseteq A \\ 0_L & \text{otherwise} \end{array} \right. \text{ and } \hat{v}_{x,\alpha }(A)= \left\{ \begin{array}{ll} 0_L & \text{if } A = \emptyset \\ \alpha & \text{if } A \subseteq \{i \in \mathcal {C} \mid x_i> \alpha \} \\ 1_L & \text{otherwise}. \end{array} \right. $$

Note that we always have \(\check{v}_{x,\alpha } \le \hat{v}_{x,\alpha }\). Using \(\check{v}_{x,\alpha }\) and \(\hat{v}_{x,\alpha }\) we can determine the set of capacities (which is a sub-interval of the lattice of capacities) consistent with a statement of the type \(S_v(x)\ge \alpha \), \(S_v(x)\le \alpha \), or \(S_v(x) = \alpha \).

Proposition 2

The set of capacities satisfying the inequality \(S_v(x)\ge \alpha \) is:

$$\begin{aligned} \{v \in V| S_v(x) \ge \alpha \} = \{ v \in V | \check{v}_{x,\alpha } \le v \le \top _V \}= [\check{v}_{x,\alpha },\top _V ]_V \end{aligned}$$
(4)

while the set of capacities satisfying the inequality \(S_v(x)\le \alpha \) is:

$$\begin{aligned} \{v \in V| S_v(x) \le \alpha \} = \{ v \in V | \bot _V \le v \le \hat{v}_{x,\alpha } \}= [\bot _V,\hat{v}_{x,\alpha }]_V . \end{aligned}$$
(5)

Therefore, the set of capacities satisfying \(S_v(x)=\alpha \) is:

$$\{v \in V| S_v(x)=\alpha \}=\{v \in V| \check{v}_{x,\alpha } \le v \le \hat{v}_{x,\alpha }\}= [\check{v}_{x,\alpha },\hat{v}_{x,\alpha }]_V .$$
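Definition 1 and Proposition 2 can be illustrated with the following Python sketch. It follows the case order of Definition 1 literally; the dictionary representation, the function names (`v_check`, `v_hat`, `prop2_holds`) and the test capacity are assumptions made for the example.

```python
from itertools import combinations

SCALE = [i / 10 for i in range(11)]  # assumed scale L = {0, 0.1, ..., 1}

def subsets(criteria):
    items = sorted(criteria)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def sugeno(v, x):
    """Max-min form of the Sugeno integral (Eq. 1)."""
    return max(min(v[a], min(x[i] for i in a)) for a in v if a)

def leq(v1, v2):
    return all(v1[g] <= v2[g] for g in v1)

def v_check(x, alpha):
    """Lower bound of Definition 1, for a statement S_v(x) >= alpha."""
    full = frozenset(x)
    support = frozenset(i for i in x if x[i] >= alpha)
    return {a: 1 if a == full else (alpha if support <= a else 0)
            for a in subsets(x)}

def v_hat(x, alpha):
    """Upper bound of Definition 1, for a statement S_v(x) <= alpha."""
    strict = frozenset(i for i in x if x[i] > alpha)
    return {a: 0 if not a else (alpha if a <= strict else 1)
            for a in subsets(x)}

def prop2_holds(v, x, alpha):
    """Check both equivalences of Proposition 2 for one capacity v."""
    ge_ok = (sugeno(v, x) >= alpha) == leq(v_check(x, alpha), v)
    le_ok = (sugeno(v, x) <= alpha) == leq(v, v_hat(x, alpha))
    return ge_ok and le_ok

# Hypothetical capacity on three criteria and one profile.
fs = frozenset
v = {fs(): 0, fs({1}): 0.3, fs({2}): 0.5, fs({3}): 0.2,
     fs({1, 2}): 0.6, fs({1, 3}): 0.3, fs({2, 3}): 0.9, fs({1, 2, 3}): 1}
x = {1: 0.2, 2: 0.4, 3: 0.5}
print(v_check(x, 0.4)[fs({2, 3})], v_hat(x, 0.4)[fs({3})])
print(all(prop2_holds(v, x, alpha) for alpha in SCALE))
```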

In [12] the authors focus on a set \(\mathcal {P}\) of assignments of alternatives to global evaluations, that is, the constraints \(S_v(x^k)=\alpha _k\) for \(k=1,\ldots ,m\). The set of the capacities compatible with all assignments in \(\mathcal {P}\) is

$$V^\mathcal {P}= \Big \{v \in V \Big | \bigvee _{k=1}^m \check{v}_{x^k,\alpha _k} \le v \le \bigwedge _{k=1}^m \hat{v}_{x^k,\alpha _k} \Big \}= \Big [ \bigvee _{k=1}^m \check{v}_{x^k,\alpha _k} ,\bigwedge _{k=1}^m \hat{v}_{x^k,\alpha _k} \Big ]_V.$$

To determine whether the set of capacities \(V^\mathcal {P}\) consistent with the preferences \(\mathcal {P}\) is empty, it is not necessary to compare the capacities \(\vee _{k=1}^m \check{v}_{x^k,\alpha _k} \) and \( \wedge _{k=1}^m \hat{v}_{x^k,\alpha _k}\) for all subsets of criteria A, since the following property (which makes use of Proposition 1) is proved in [12].

Proposition 3

The set of capacities \(V^\mathcal {P}= [ \vee _{k=1}^m \check{v}_{x^k,\alpha _k} ,\wedge _{k=1}^m \hat{v}_{x^k,\alpha _k}]_V \) is not empty if and only if for all \(\alpha _ k < \alpha _l\) we have \(\{ i| x^l_i \ge \alpha _l\} \not \subseteq \{ i| x^k_i > \alpha _k\}\).
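The condition of Proposition 3 only involves level sets of the assigned alternatives, so it can be tested without building any capacity. The Python sketch below is an illustration under our own assumptions: assignments are given as pairs (profile, level), each single assignment is assumed to be satisfiable on its own, and the function name `consistent` and the numeric profiles are hypothetical.

```python
def consistent(assignments):
    """Test of Proposition 3 on a list of (x, alpha) assignments.

    x is a dict criterion -> level. Returns False as soon as some pair
    with alpha_k < alpha_l violates the non-inclusion condition.
    """
    for xk, ak in assignments:
        for xl, al in assignments:
            if ak < al:
                high_l = {i for i in xl if xl[i] >= al}
                strict_k = {i for i in xk if xk[i] > ak}
                if high_l <= strict_k:
                    return False
    return True

# Compatible pair: {1} is not included in {3}.
ok = [({1: 0.2, 2: 0.4, 3: 0.5}, 0.4), ({1: 0.7, 2: 0.2, 3: 0.4}, 0.5)]
# Incompatible pair: the first assignment forces v({1, 3}) <= 0.2,
# the second forces v({1}) >= 0.5, which contradicts monotonicity.
bad = [({1: 0.7, 2: 0.2, 3: 0.4}, 0.2), ({1: 0.5, 2: 0.1, 3: 0.1}, 0.5)]
print(consistent(ok), consistent(bad))
```

The test is quadratic in the number of assignments and linear in the number of criteria, instead of exponential as a subset-by-subset comparison of the two bounds would be.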

In this work we do not assume that all input statements are assignments; instead we collect, using an interactive process, statements of the type \(S_v(x)\ge \alpha \) or \(S_v(x)\le \alpha \). Let \(\mathcal {P}\) be divided into two parts \((x^k,\alpha _k)_{k=1,\ldots ,m_1} \) and \((y^k, \lambda _k)_{k=1,\ldots ,m_2}\) such that the global evaluation of \(x^k\) is at least \(\alpha _k\) and the global evaluation of \(y^k\) is at most \(\lambda _k\). Hence the set of consistent capacities \(V^{\mathcal {P}}\) is:

$$\begin{aligned} V^\mathcal {P}&= \Big \{v \in V \Big | \bigvee _k \check{v}_{x^k, \alpha _k } \le v \Big \} \cap \Big \{v \in V | v \le \bigwedge _k \hat{v}_{y^k, \lambda _k} \Big \}\end{aligned}$$
(6)
$$\begin{aligned}&= \Big [ \bigvee _{k=1}^{m_1} \check{v}_{x^k,\alpha _k} ,\bigwedge _{k=1}^{m_2} \hat{v}_{y^k,\lambda _k} \Big ]_V. \end{aligned}$$
(7)

Note that, in general, this intersection can be empty. We will see in the next section that it is never empty when the statements are collected with the proposed algorithm.

We conclude this section with a remark concerning the focal sets of a capacity. The qualitative Moebius transform of a capacity v is the set function \(v_\#\) defined as follows:

$$ v_\#(A) = \left\{ \begin{array}{l} v(A) \text{ if } v(B) <v(A) \;\; \forall B \subset A \\ 0 \text{ otherwise } \end{array} \right. $$

The sets A such that \(v_\#(A) >0\) are called the focal sets of v. The qualitative Moebius transform contains all the information needed to compute v since for all A, \( v(A) = \vee _{B \subseteq A} v_\#(B)\), and the qualitative Moebius transform is sufficient to calculate the Sugeno integral:

$$\begin{aligned} S_v (x )= \max _{A\subseteq \mathcal {C}} \min (v_\#(A), \min _{i\in A} x_i) \end{aligned}$$
(8)
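The qualitative Moebius transform and Eq. 8 can be sketched in Python as follows. The dictionary representation and the names `moebius`, `rebuild` and `sugeno_from_moebius` are assumptions of this illustration, and the capacity and profile are example values.

```python
def moebius(v):
    """Qualitative Moebius transform: keep v(A) only if every proper
    subset of A has a strictly smaller value, otherwise 0."""
    return {a: v[a] if all(v[b] < v[a] for b in v if b < a) else 0
            for a in v}

def rebuild(m):
    """Recover the capacity: v(A) is the max of m(B) over B included in A."""
    return {a: max(m[b] for b in m if b <= a) for a in m}

def sugeno_from_moebius(m, x):
    """Eq. 8, restricted to the focal sets (those with m(A) > 0)."""
    return max((min(m[a], min(x[i] for i in a)) for a in m if a and m[a] > 0),
               default=0)

fs = frozenset
v = {fs(): 0, fs({1}): 0.5, fs({2}): 0, fs({3}): 0,
     fs({1, 2}): 0.5, fs({1, 3}): 0.5, fs({2, 3}): 0.4, fs({1, 2, 3}): 1}

m = moebius(v)
focal = {a: m[a] for a in m if m[a] > 0}
print(focal)  # three focal sets: {1}, {2, 3} and {1, 2, 3}
print(sugeno_from_moebius(m, {1: 0.7, 2: 0.2, 3: 0.4}))
```

In this example only three of the eight subsets are focal, so the maximum of Eq. 8 ranges over three terms instead of seven.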

This means that we just need to identify the focal sets in order to calculate the Sugeno integral. The preferences, described above, may be just given for objects with local evaluations equal to \(0_L\) or \(1_L\). Nevertheless, in practice these theoretical objects could be inappropriate. For instance, imagine the situation of caregivers assessing the overall health of given patients: it would be difficult for them to assess abstract patients without referring to real cases. This difficulty of reasoning with abstract items is the reason why we decide to not use focal sets in the method proposed in this paper.

3 Incremental Elicitation Protocol

We provide an interactive elicitation method based on the maximin decision criterion. The goal of the elicitation is to determine a necessary winner.

Our method bears similarity to methods, relying on minimax regret, for the incremental elicitation of a capacity for the Choquet integral [2]. We note that, in our qualitative framework, minimax regret is not applicable since the difference between two Sugeno values is meaningless in a decision context.

First of all, in Sect. 3.1, we introduce some concepts of decision making under uncertainty that are used to identify the most promising alternative when the capacity is not known precisely. We focus on the case where the capacity lies in an interval between a lower and an upper capacity.

Then in Sect. 3.2 we use these concepts to design our interactive elicitation protocol.

3.1 Reasoning with an Uncertain Capacity

Suppose now that a set \(\mathcal {P}\) of statements has been collected and that the set of capacities \(V^\mathcal {P}\subseteq V\) consistent with \(\mathcal {P}\) has been identified; each \(v \in V^{\mathcal {P}}\) is such that v satisfies all preferences in \(\mathcal {P}\).

First of all, we observe that, in some cases, the set \(\mathcal {P}\) is enough to identify the best alternative in X. An alternative is a necessary winner if it is the optimal alternative with respect to all capacities in \(V^{\mathcal {P}}\).

Definition 2

A necessary winner with respect to \(\mathcal {P}\) is an alternative \(x \in X\) such that

$$ x \in \arg \max _{x' \in X} S_v(x') \quad \quad \forall v \in V^{\mathcal {P}}. $$

If a necessary winner exists, it is not necessary to elicit further information from the decision maker in order to make a decision, since the current available information is enough to identify the best choice (or one of the best choices, in case of ties) among the set of alternatives.

In most cases, however, a necessary winner does not exist. When it is necessary to make a choice knowing only that the capacity lies in \(V^{\mathcal {P}}\), we can recommend the alternative(s) ensuring the highest Sugeno value in the worst case. We therefore adopt the maximin criterion (similarly to a previous work in multiattribute decision making [15]), which is particularly suited to the ordinal settings where the Sugeno integral is typically used.

Given a set of capacities \(V^{\mathcal {P}}\) and an alternative \(x \in X\), the minimum (or pessimistic) value according to Sugeno is \(s^{\downarrow }_{\mathcal {P}}(x) = \min _{v \in V^{\mathcal {P}}} S_v(x)\) while its maximum (or optimistic) value is \(s^{\uparrow }_{\mathcal {P}}(x)= \max _{v \in V^{\mathcal {P}}} S_v(x)\). We now define the maximin Sugeno value \(s^{*}_{\mathcal {P}}\) as

$$\begin{aligned} s^{*}_{\mathcal {P}}= \max _{x \in X} s^{\downarrow }_{\mathcal {P}}(x) = \max _{x \in X} \min _{v \in V^{\mathcal {P}}} S_v(x) \end{aligned}$$

and a maximin recommendation \(x^{*}_{\mathcal {P}}\) is such that:

$$\begin{aligned} x^{*}_{\mathcal {P}}\in \arg \max _{x \in X} s^{\downarrow }_{\mathcal {P}}(x) = \arg \max _{x \in X} \min _{v \in V^{\mathcal {P}}} S_v(x) \end{aligned}$$

and \(x^{*}_{\mathcal {P}}\) is said to be maximin optimal (\(x^{*}_{\mathcal {P}}\) has the highest “pessimistic” value). The value \(s^{\downarrow }_{\mathcal {P}}(x)\) is the worst-case “utility” associated with recommending alternative x; any choice that is not maximin optimal has strictly lower Sugeno value than \(x^*\) for some capacity \(v \in V^{\mathcal {P}}\).

We further assume that all preference statements are of the kind \(S(x) \ge \alpha \) or \(S(x) \le \alpha \). Then, as we have seen previously in Sect. 2.3, the set of capacities consistent with \(\mathcal {P}\) can be written as the intersection of intervals. We will see that, using the proposed algorithm, this set of capacities is a lattice interval; i.e.,

$$ V^\mathcal {P}= [\check{v}, \hat{v}]_V. $$

Given an interval of capacities, the maximin alternative (the choice that maximizes the worst-case Sugeno value) is easily found: using the property described in Eq. 3 and the fact that the set of the valid capacities is a lattice, \(s^{\downarrow }_{\mathcal {P}}(x)\), the minimum Sugeno value of an alternative x, is just \(S_{\check{v}}(x)\), the Sugeno integral of x computed with the “bottom” capacity \(\check{v}\).

Proposition 4

Assuming that the set of feasible capacities \(V^{\mathcal {P}}\) is an interval (sublattice) of V, then we have:

$$\begin{aligned} s^{\downarrow }_{\mathcal {P}}(x) = S_{\check{v}}(x) \text {, } s^{*}_{\mathcal {P}}= \max _{x \in X} S_{\check{v}}(x) \text { and } x^{*}_{\mathcal {P}}\in \arg \max _{x \in X} S_{\check{v}}(x) \end{aligned}$$
(9)

We need a measure of how “uncertain” we are with respect to our recommendation. Now, we consider the most optimistic Sugeno value, i.e. the maximax value, that can be attained by any alternative y different from the recommendation \(x^{*}_{\mathcal {P}}\).

$$\begin{aligned} s^{\circ }_{\mathcal {P}}= \max _{y \ne x^{*}_{\mathcal {P}}} s^{\uparrow }_{\mathcal {P}}(y) = \max _{y \ne x^{*}_{\mathcal {P}}} \max _{v \in V^{\mathcal {P}}} S_v(y) = \max _{y \ne x^{*}_{\mathcal {P}}} S_{\hat{v}}(y) \\ y^{\circ }_{\mathcal {P}}\in \arg \max _{y \ne x^{*}_{\mathcal {P}}} s^{\uparrow }_{\mathcal {P}}(y) = \arg \max _{y \ne x^{*}_{\mathcal {P}}} \max _{v \in V^{\mathcal {P}}} S_v(y) = \arg \max _{y \ne x^{*}_{\mathcal {P}}} S_{\hat{v}}(y) \end{aligned}$$

We dub \(y^{\circ }_{\mathcal {P}}\) as the “adversary”, since it is the alternative that may have the highest value. Recall that \(s^{*}_{\mathcal {P}}\) is the value of the maximin optimal recommendations. By comparing \(s^{*}_{\mathcal {P}}\) and \(s^{\circ }_{\mathcal {P}}\) we determine whether there is any residual uncertainty about which is the optimal alternative. We notice that if \(s^{*}_{\mathcal {P}}\ge s^{\circ }_{\mathcal {P}}\) then it means that the current maximin recommendation \(x^{*}_{\mathcal {P}}\) is surely an optimal recommendation. This observation is formally stated in the following proposition, whose proof is very straightforward.

Proposition 5

If \(s^{*}_{\mathcal {P}}\ge s^{\circ }_{\mathcal {P}}\) then \(x^{*}_{\mathcal {P}}\) is a necessary winner.

Proof

For all \(v \in V^{\mathcal {P}}\), \(S_v(x^{*}_{\mathcal {P}}) \ge s^{*}_{\mathcal {P}}\), and \(s^{\circ }_{\mathcal {P}}\ge s^{\uparrow }_{\mathcal {P}}(y) \ge S_v(y)\), for all \(y \ne x^{*}_{\mathcal {P}}\). Therefore, if \(s^{*}_{\mathcal {P}}\ge s^{\circ }_{\mathcal {P}}\), then by transitivity we have \(S_v(x^{*}_{\mathcal {P}}) \ge S_v(y)\) for all \(v \in V^{\mathcal {P}}\) and for all \(y \ne x^{*}_{\mathcal {P}}\).
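Propositions 4 and 5 translate into a short computation once the bounds \(\check{v}\) and \(\hat{v}\) are available. The Python sketch below is one possible reading, not the authors' implementation: the names `sugeno` and `recommend`, the dictionary representation, the tie-breaking (first alternative encountered) and the two-criteria data are assumptions, and at least two alternatives are required.

```python
def sugeno(v, x):
    """Max-min form of the Sugeno integral (Eq. 1)."""
    return max(min(v[a], min(x[i] for i in a)) for a in v if a)

def recommend(X, lower, upper):
    """Return (x_star, s_star, adversary, s_circ, is_necessary_winner).

    X maps alternative names to profiles; lower and upper are the bounds
    of the interval of feasible capacities.
    """
    pess = {name: sugeno(lower, x) for name, x in X.items()}
    opt = {name: sugeno(upper, x) for name, x in X.items()}
    x_star = max(pess, key=pess.get)             # maximin recommendation
    rivals = [name for name in X if name != x_star]
    adversary = max(rivals, key=opt.get)         # best optimistic rival
    # Sufficient condition of Proposition 5.
    return (x_star, pess[x_star], adversary, opt[adversary],
            pess[x_star] >= opt[adversary])

fs = frozenset
bottom = {fs(): 0, fs({1}): 0, fs({2}): 0, fs({1, 2}): 1}
top = {fs(): 0, fs({1}): 1, fs({2}): 1, fs({1, 2}): 1}

X1 = {"p": {1: 0.5, 2: 0.5}, "q": {1: 0.2, 2: 0.9}}
X2 = {"p": {1: 0.5, 2: 0.5}, "r": {1: 0.1, 2: 0.4}}
print(recommend(X1, bottom, top))  # q may still reach 0.9: no conclusion yet
print(recommend(X2, bottom, top))  # r reaches at most 0.4: p is a necessary winner
```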

Example 1

Suppose that the available alternatives are \(X=\{a,b,c,d\}\) whose performances are given in the following table of criteria.

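$$\begin{array}{c|ccc}
\text{Alternative} & \text{Criterion } 1 & \text{Criterion } 2 & \text{Criterion } 3 \\ \hline
a & 0.2 & 0.4 & 0.5 \\
b & 0.7 & 0.2 & 0.4 \\
c & 0.1 & 1 & 0.7 \\
d & 0 & 0.5 & 0
\end{array}$$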

The scale is \(L =\{ 0, 0.1, 0.2 , \ldots , 0.9, 1 \} \). Assume that we know that alternative a has Sugeno value at least 0.4, that alternative b has Sugeno value at least 0.5, and that alternative c has Sugeno value at most 0.8; that is \(S_v(a)\ge 0.4\), \(S_v(b) \ge 0.5\), and \(S_v(c) \le 0.8\). We now inspect the lower bound \(\check{v}\) and the upper bound \(\hat{v}\) capacities, obtained by combining Eqs. 4 and 5.

$$\begin{array}{c|cccccccc}
\text{Subset} & \emptyset & \{1\} & \{2\} & \{3\} & \{1,2\} & \{1,3\} & \{2,3\} & \{1,2,3\} \\ \hline
\check{v}(\cdot ) & 0 & 0.5 & 0 & 0 & 0.5 & 0.5 & 0.4 & 1 \\
\hat{v}(\cdot ) & 0 & 1 & 0.8 & 1 & 1 & 1 & 1 & 1
\end{array}$$

We determine the optimistic value \(s^{\uparrow }_{\mathcal {P}}\) and the pessimistic value \(s^{\downarrow }_{\mathcal {P}}\) of each alternative by computing the Sugeno integral of a, b, c and d with respect to \(\hat{v}\) and \(\check{v}\), respectively.

$$\begin{array}{c|cccc}
\text{Alternative} & a & b & c & d \\ \hline
s^{\downarrow }_{\mathcal {P}}(\cdot ) & 0.4 & 0.5 & 0.4 & 0 \\
s^{\uparrow }_{\mathcal {P}}(\cdot ) & 0.5 & 0.7 & 0.8 & 0.5
\end{array}$$

We can determine that b attains the maximin optimal value \(s^{*}_{\mathcal {P}}=s^{\downarrow }_{\mathcal {P}}(b)=0.5\), while the adversary is c that can obtain up to \(s^{\circ }_{\mathcal {P}}=s^{\uparrow }_{\mathcal {P}}(c)=0.8\) in the optimistic case.
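The numbers of Example 1 can be reproduced with the following Python sketch, which builds the two bounds from the three statements using Definition 1 and then evaluates every alternative. The data come from the example; the dictionary representation and the function names are assumptions of this illustration.

```python
from itertools import combinations

def subsets(criteria):
    items = sorted(criteria)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def sugeno(v, x):
    """Max-min form of the Sugeno integral (Eq. 1)."""
    return max(min(v[a], min(x[i] for i in a)) for a in v if a)

def v_check(x, alpha):
    """Lower bound of Definition 1 for a statement S_v(x) >= alpha."""
    full = frozenset(x)
    support = frozenset(i for i in x if x[i] >= alpha)
    return {a: 1 if a == full else (alpha if support <= a else 0)
            for a in subsets(x)}

def v_hat(x, alpha):
    """Upper bound of Definition 1 for a statement S_v(x) <= alpha."""
    strict = frozenset(i for i in x if x[i] > alpha)
    return {a: 0 if not a else (alpha if a <= strict else 1)
            for a in subsets(x)}

X = {
    "a": {1: 0.2, 2: 0.4, 3: 0.5},
    "b": {1: 0.7, 2: 0.2, 3: 0.4},
    "c": {1: 0.1, 2: 1, 3: 0.7},
    "d": {1: 0, 2: 0.5, 3: 0},
}

# Statements S(a) >= 0.4 and S(b) >= 0.5 raise the lower bound;
# statement S(c) <= 0.8 lowers the upper bound.
va, vb = v_check(X["a"], 0.4), v_check(X["b"], 0.5)
lower = {g: max(va[g], vb[g]) for g in va}
upper = v_hat(X["c"], 0.8)

pess = {name: sugeno(lower, x) for name, x in X.items()}
opt = {name: sugeno(upper, x) for name, x in X.items()}
x_star = max(pess, key=pess.get)
adversary = max((n for n in X if n != x_star), key=opt.get)
print(pess, opt, x_star, adversary)
```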

We conclude this part by observing that the condition in the proposition above gives us a sufficient condition for detecting a necessary winner, but not a necessary one: it is possible that a necessary winner exists even when such condition is not satisfied (see example below). This means that, in some cases, the interactive approach that we present next may pose some questions that could be avoided with a more precise check for determining a necessary winner. However, by proceeding in this way we keep the algorithm rather simple and efficient.

Example 1 (continued)

Now consider the set of alternatives to be restricted to c and d. Note that c dominates d, that is, the former has a strictly higher performance than the latter with respect to all three criteria; since the Sugeno integral is monotone in the evaluations, it follows that the Sugeno value of c is at least that of d for every capacity. Alternative c is therefore a necessary winner in \(X' = \{ c,d \}\). However, we have \(s^{\downarrow }_{\mathcal {P}}(c) = 0.4 < 0.5 = s^{\uparrow }_{\mathcal {P}}(d)\) and the condition of Proposition 5 is not met.

[Algorithm 1: pseudocode of the interactive elicitation procedure]

3.2 An Interactive Elicitation Scheme for Determining a Necessary Winner

This section proposes an interactive elicitation process based on the concepts introduced above; Algorithm 1 depicts the pseudocode of our procedure. The input parameters are X, the dataset, and the preference statements \(\mathcal {P}\). During the course of the process, we maintain an explicit representation of the set \(V^{\mathcal {P}}\) of feasible capacities.

The pair \((\check{v}, \hat{v})\) is initialized depending on \(\mathcal {P}\):

  • In the case that we start from an empty set of statements (\(\mathcal {P}= \emptyset \)) we initialize the pair \((\check{v}, \hat{v})=(\bot ,\top )\). These capacities entail particular cases of the Sugeno integral: \(S_\bot (x) = \min _{i =1}^n x_i\) and \(S_\top (x) = \max _{i =1}^n x_i\).

  • If \(\mathcal {P}\) is not empty, for each \(p \in \mathcal {P}\), we use Eq. 7 in order to initialise \((\check{v}, \hat{v})\).

At each step of the elicitation, a query is asked and a new statement is acquired. Based on this, the lattice is updated. We then compute the new maximin optimal alternative ensuring the highest value \(s^{\downarrow }_{\mathcal {P}}\).

Questions are chosen by considering the value \(s^{*}_{\mathcal {P}}\) of the maximin alternative \(x^{*}_{\mathcal {P}}\), given the preferences \(\mathcal {P}\), and the optimistic value of the adversary \(y^{\circ }_{\mathcal {P}}\), that is, the alternative different from \(x^{*}_{\mathcal {P}}\) that has the highest optimistic Sugeno value. Proposition 5 gives us a termination condition for ending the elicitation process.

The function Update updates the lower bound \(\check{v}\) and the upper bound \(\hat{v}\) of the lattice of capacities consistent with the current information. That is,

$$\begin{aligned} \check{v}&:= \check{v} \vee \check{v}_{x, \text {succ}(\alpha )}&\text { if } p \text { of type } S(x) > \alpha \\ \hat{v}&:= \hat{v} \wedge \hat{v}_{x, \alpha }&\text { if } p \text { of type } S(x) \le \alpha \end{aligned}$$

where \(\text {succ}(\alpha )\) is the level in L right above \(\alpha \); see Definition 1 for how \(\check{v}_{x, \alpha }\) and \(\hat{v}_{x, \alpha }\) are defined.

A question is identified by a pair \((x, \alpha )\): the alternative x that we are asking about, and the level \(\alpha \). The space of possible queries at a given step of the elicitation is

$$Q(\check{v},\hat{v}) = \{ (x,\alpha ) | x \in X, S_{\check{v}}(x) \le \alpha \le S_{\hat{v}}(x) \}.$$

We now formally state a property ensuring that the algorithm cannot lead to an empty set of feasible capacities.

Proposition 6

During all the steps of procedure InteractiveElic we have \(\check{v} \le \hat{v}\) (that means the sublattice of capacities that they represent is not empty).

Proof

The property is true when we start the procedure. The proof is based on showing that the property is still satisfied after updating the lattice of capacities.

Suppose that the current bounds \(\check{v}\) and \(\hat{v}\) satisfy \(\check{v} \le \hat{v}\). Consider any pair \((x,\alpha ) \in Q(\check{v},\hat{v})\), i.e., x and \(\alpha \) satisfy \(S_{\check{v}}(x) \le \alpha \le S_{\hat{v}}(x).\) There are two possible answers: either \(S_v(x) \ge \alpha \) or \(S_v(x) \le \alpha .\) We then update the bounds of the set of feasible capacities accordingly: in the first case the lower bound changes, while in the second case the upper bound changes.

  • Suppose that the answer is \(S_v(x) \ge \alpha \). Let us denote the new lower bound by \(\check{v}'\),

    $$\check{v}'(A)= \check{v}(A) \vee \check{v}_{x, \alpha } (A)= \left\{ \begin{array}{l} \max (\check{v}(A),\alpha ) \text{ if } \{ i| x_i \ge \alpha \} \subseteq A\\ \check{v}(A) \text{ otherwise } \end{array} \right. $$

    Let us prove that \(\check{v}'\le \hat{v}\). We have \( \check{v} \le \hat{v}\) so we just need to prove that \(\hat{v}(A) \ge \alpha \) if \(\{i| x_i \ge \alpha \} \subseteq A\): We have

    $$S_{\hat{v}}(x)= \max _{ \beta \in L } \min ( \hat{v}(\{i|x_i \ge \beta \}), \beta ) \ge \alpha ,$$

    so there exists \(\beta \ge \alpha \) such that \( \hat{v}(\{i|x_i \ge \beta \}) \ge \beta \). We have \(\{i|x_i \ge \beta \} \subseteq \{i|x_i \ge \alpha \}\) which entails \( \hat{v}(\{i|x_i \ge \alpha \}) \ge \hat{v}(\{i|x_i \ge \beta \}) \ge \beta \ge \alpha .\) We conclude using the monotonicity of \( \hat{v}\).

  • Suppose that the answer is \(S_v(x) \le \alpha \). Let us denote the new upper bound by \(\hat{v}'\),

    $$\hat{v}'(A)= \hat{v}(A) \wedge \hat{v}_{x, \alpha } (A)= \left\{ \begin{array}{l} \min ( \hat{v}(A), \alpha ) \text{ if } A \subseteq \{ i| x_i > \alpha \}\\ \hat{v}(A) \text{ otherwise } \end{array} \right. $$

    Let us prove that \(\check{v} \le \hat{v}'\). We have \( \check{v} \le \hat{v}\) so we just need to prove that \(\check{v}(A) \le \alpha \) if \( A \subseteq \{i| x_i > \alpha \}\): We have

    $$S_{\check{v}}(x)= \min _{ \beta \in L } \max ( \check{v}(\{i|x_i > \beta \}), \beta ) \le \alpha ,$$

    so there exists \(\beta \le \alpha \) such that \( \check{v}(\{i|x_i > \beta \}) \le \beta \). We have \(\{i|x_i> \alpha \} \subseteq \{i|x_i > \beta \}\) which entails \( \check{v}(\{i|x_i> \alpha \}) \le \check{v}(\{i|x_i > \beta \}) \le \beta \le \alpha .\) We conclude using the monotonicity of \( \check{v}\).
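The invariant of Proposition 6 can also be observed in simulation. The Python sketch below is an illustration under several assumptions of ours: answers are generated from a hidden capacity `v_true`, queries are drawn at random from \(Q(\check{v},\hat{v})\) rather than with one of the strategies of the paper, and answers are of the form \(S_v(x) \ge \alpha \) or \(S_v(x) \le \alpha \) as in the proof above. All names and numeric values are hypothetical.

```python
import random
from itertools import combinations

SCALE = [i / 10 for i in range(11)]  # assumed scale L = {0, 0.1, ..., 1}

def subsets(criteria):
    items = sorted(criteria)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def sugeno(v, x):
    return max(min(v[a], min(x[i] for i in a)) for a in v if a)

def leq(v1, v2):
    return all(v1[g] <= v2[g] for g in v1)

def v_check(x, alpha):
    full = frozenset(x)
    support = frozenset(i for i in x if x[i] >= alpha)
    return {a: 1 if a == full else (alpha if support <= a else 0)
            for a in subsets(x)}

def v_hat(x, alpha):
    strict = frozenset(i for i in x if x[i] > alpha)
    return {a: 0 if not a else (alpha if a <= strict else 1)
            for a in subsets(x)}

def update(lower, upper, x, alpha, at_least):
    """Raise the lower bound (answer S(x) >= alpha) or cut the upper bound."""
    if at_least:
        vc = v_check(x, alpha)
        lower = {a: max(lower[a], vc[a]) for a in lower}
    else:
        vh = v_hat(x, alpha)
        upper = {a: min(upper[a], vh[a]) for a in upper}
    return lower, upper

def simulate(X, v_true, steps=30, seed=0):
    rng = random.Random(seed)
    crit = next(iter(X.values()))
    full = frozenset(crit)
    lower = {a: 1 if a == full else 0 for a in subsets(crit)}  # bottom
    upper = {a: 1 if a else 0 for a in subsets(crit)}          # top
    ok = True
    for _ in range(steps):
        # Query space Q(lower, upper).
        pool = [(x, al) for x in X.values() for al in SCALE
                if sugeno(lower, x) <= al <= sugeno(upper, x)]
        x, alpha = rng.choice(pool)
        lower, upper = update(lower, upper, x, alpha, sugeno(v_true, x) >= alpha)
        ok = ok and leq(lower, upper) and leq(lower, v_true) and leq(v_true, upper)
    return lower, upper, ok

fs = frozenset
v_true = {fs(): 0, fs({1}): 0.3, fs({2}): 0.5, fs({3}): 0.2,
          fs({1, 2}): 0.6, fs({1, 3}): 0.3, fs({2, 3}): 0.9, fs({1, 2, 3}): 1}
X = {"a": {1: 0.2, 2: 0.4, 3: 0.5}, "b": {1: 0.7, 2: 0.2, 3: 0.4},
     "c": {1: 0.1, 2: 1, 3: 0.7}, "d": {1: 0, 2: 0.5, 3: 0}}
lower, upper, invariant_held = simulate(X, v_true)
print(invariant_held)
```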

Fig. 1. The CSS1 strategy analyzes the different relative positions of the upper bounds and lower bounds of \(x^{*}_{\mathcal {P}}\) and \(y^{\circ }_{\mathcal {P}}\).

3.3 Strategies to Choose the Next Question

We now address the problem of choosing the next question. This is an important point since a good strategy for asking questions will reduce the length of the elicitation process and also mitigate the cognitive effort of the user. We consider different strategies to select the next question based on the current lattice of valid capacities. The effectiveness of these strategies is evaluated in simulation (see Sect. 4).

The Current solution strategy (CSS) uses the information about the current best recommendation \(x^{*}_{\mathcal {P}}\), and the “adversary” \(y^{\circ }_{\mathcal {P}}\) to derive a question to ask. We propose two versions of this idea:

  • CSS0 (simpler version): we simply choose to ask about \(x^{*}_{\mathcal {P}}\) or \(y^{\circ }_{\mathcal {P}}\) depending on which has the largest interval, and as level we pick the midpoint.

  • CSS1 (more elaborate): we evaluate candidate queries with respect to their capability of resolving the uncertainty about which of \(x^{*}_{\mathcal {P}}\) and \(y^{\circ }_{\mathcal {P}}\) has the highest Sugeno value. The discussion depends on how the intervals \([s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}),s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})]\) and \([s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}),s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})]\) relate to each other. Note that \(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}) \ge s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\) holds by definition of maximin. We inspect the order between \(s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\) and \(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\) to decide which queries to consider. We then propose a heuristic to choose between these possible queries based on the length of the intervals \([s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}),s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})]\) and \([s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}),s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})]\) as depicted in Fig. 1. In the following discussion, we denote by \(d ( \alpha ,\beta ) \) the number of levels between \(\alpha \) and \(\beta \), where \(\alpha \) and \(\beta \) are elements of the scale L.

    • Case i): \(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}) \le s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\), i.e., \( \bigg [s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}) \quad [s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}) \quad s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\bigg ] \quad s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})]\). The optimistic value of \(y^{\circ }_{\mathcal {P}}\) is lower than or equal to the optimistic value of \(x^{*}_{\mathcal {P}}\).

      In this case we could ask the user to compare alternative \(x^{*}_{\mathcal {P}}\) and the level \(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\). If the answer is that \(S_v(x^{*}_{\mathcal {P}}) \ge s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\), we know that \(y^{\circ }_{\mathcal {P}}\) cannot be better than \(x^{*}_{\mathcal {P}}\), and therefore we resolve the uncertainty between the two; this event happens if the true Sugeno value of \(x^{*}_{\mathcal {P}}\) is between \(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\) and \(s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\). We then quantify the “score” of this query as the proportion of the interval of \([s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}),s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})]\) that makes us certain that \(x^{*}_{\mathcal {P}}\) is preferred to \(y^{\circ }_{\mathcal {P}}\), i.e. the number of levels between \(s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\) and \(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\) divided by the number of levels between \(s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\) and \(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\). Hence the value of this query is \( \frac{d(s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}),s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}))}{d(s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}),s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}))}\).

      Alternatively, we could also ask to compare alternative \(y^{\circ }_{\mathcal {P}}\) and the level \(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\). If the user states that the Sugeno value of \(y^{\circ }_{\mathcal {P}}\) is lower than \(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\), then we can also conclude that \(y^{\circ }_{\mathcal {P}}\) cannot be better than \(x^{*}_{\mathcal {P}}\). Reasoning as above, we score this query \(\frac{d(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}),s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}))}{d(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}),s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}))}\).

      We ask the query (among the two) that has the highest “value”.

    • Case ii): \(s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}) \le s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\), i.e., \( \bigg [s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}) \quad [s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}) \quad s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}) ]\quad s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\bigg ]\). The optimistic value of \(y^{\circ }_{\mathcal {P}}\) is at least the optimistic value of \(x^{*}_{\mathcal {P}}\).

      We can ask to compare alternative \(y^{\circ }_{\mathcal {P}}\) and \(s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\), whose “score” is \(\frac{d(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}),s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}))}{d(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}),s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}))}\), or, as in the previous case, to compare \(y^{\circ }_{\mathcal {P}}\) and \(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\), with value \(\frac{d(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}),s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}))}{d(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}),s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}))}\). Since the denominator of both formulas is the same, the test reduces to checking \(d(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}),s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})) \ge d(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}),s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}))\).

    • Case iii): \(s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}) = s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\) and \(s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}) = s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\). In this case we ask about \(x^{*}_{\mathcal {P}}\) and its midpoint level.
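
The two CSS strategies can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: levels of L are encoded as integer indices so that \(d(\alpha ,\beta )\) becomes an absolute difference, midpoints are rounded down, ties between scores go to the first candidate, and the names `css0`, `css1`, `d`, `midpoint` and `ratio` are hypothetical. In case ii the query on \(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}})\) is scored as in case i, with numerator \(d(s^{\downarrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}),s^{\downarrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}}))\).

```python
from typing import Tuple

Interval = Tuple[int, int]  # (pessimistic, optimistic) bounds as level indices
Query = Tuple[str, int]     # ("x" or "y", level index to compare against)


def d(alpha: int, beta: int) -> int:
    """Number of levels between two positions (assumption: index difference)."""
    return abs(alpha - beta)


def midpoint(interval: Interval) -> int:
    """Middle level of an interval (assumption: rounded down)."""
    lo, hi = interval
    return (lo + hi) // 2


def ratio(num: int, den: int) -> float:
    """Score of a query; a degenerate (zero-length) interval scores 0."""
    return num / den if den else 0.0


def css0(x_int: Interval, y_int: Interval) -> Query:
    """CSS0: ask about the alternative with the larger interval, at its midpoint."""
    if d(*y_int) > d(*x_int):
        return "y", midpoint(y_int)
    return "x", midpoint(x_int)


def css1(x_int: Interval, y_int: Interval) -> Query:
    """CSS1: pick the higher-scoring of the two candidate queries."""
    (x_lo, x_hi), (y_lo, y_hi) = x_int, y_int
    if x_int == y_int:  # case iii: identical intervals
        return "x", midpoint(x_int)
    # Candidate shared by cases i and ii: compare y with the pessimistic value of x.
    ask_y_low = ("y", x_lo, ratio(d(x_lo, y_lo), d(y_hi, y_lo)))
    if y_hi <= x_hi:    # case i: compare x with the optimistic value of y
        other = ("x", y_hi, ratio(d(x_hi, y_hi), d(x_hi, x_lo)))
    else:               # case ii: compare y with the optimistic value of x
        other = ("y", x_hi, ratio(d(y_hi, x_hi), d(y_hi, y_lo)))
    target, level, _ = max(other, ask_y_low, key=lambda q: q[2])
    return target, level
```

With intervals encoded in tenths, `css0((5, 7), (4, 8))` selects `("y", 6)`, and `css1((5, 7), (4, 8))` falls into case ii, where both candidates score the same.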

The Halve Largest Gap (HLG) strategy asks about the alternative \(x_{H}\) with the largest gap, measured by the number of levels: \(x_{H} = \arg \max _{x \in X} d(s^{\downarrow }_{\mathcal {P}}(x),s^{\uparrow }_{\mathcal {P}}(x))\), and the level \(\alpha _{H}\) is the midpoint between \(s^{\uparrow }_{\mathcal {P}}(x_H)\) and \(s^{\downarrow }_{\mathcal {P}}(x_H)\).

The Random strategy chooses, as query, an alternative x at random and, as level, the midpoint between \(s^{\uparrow }_{\mathcal {P}}(x)\) and \(s^{\downarrow }_{\mathcal {P}}(x)\) (this strategy is considered as a baseline).
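
For comparison, the two baseline strategies admit an equally short sketch. The encoding is an assumption made for illustration: each alternative maps to a pair of integer level indices (pessimistic, optimistic), midpoints are rounded down, and the function names `halve_largest_gap` and `random_query` are hypothetical.

```python
import random
from typing import Dict, Optional, Tuple

Interval = Tuple[int, int]  # (pessimistic, optimistic) bounds as level indices


def halve_largest_gap(intervals: Dict[str, Interval]) -> Tuple[str, int]:
    """HLG: pick the alternative with the widest interval and ask at its midpoint."""
    x_h = max(intervals, key=lambda x: intervals[x][1] - intervals[x][0])
    lo, hi = intervals[x_h]
    return x_h, (lo + hi) // 2  # assumption: midpoint rounded down


def random_query(intervals: Dict[str, Interval],
                 rng: Optional[random.Random] = None) -> Tuple[str, int]:
    """Random baseline: pick any alternative and ask at the midpoint of its interval."""
    rng = rng or random.Random()
    x = rng.choice(sorted(intervals))
    lo, hi = intervals[x]
    return x, (lo + hi) // 2
```

For instance, with intervals expressed in tenths, `halve_largest_gap({"b": (5, 7), "c": (4, 8), "d": (0, 5)})` selects alternative `d` at level index 2.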

Example 1 (continued)

We show how the different strategies determine the next query to ask in our example. Remember that \(x^{*}_{\mathcal {P}}=b\) and \(y^{\circ }_{\mathcal {P}}=c\). The strategy CSS0 asks about alternative c and its midpoint level 0.6, since the interval \([s^{\downarrow }_{\mathcal {P}}(b), s^{\uparrow }_{\mathcal {P}}(b)]=[0.5, 0.7]\) is smaller (in terms of number of levels) than the interval \([s^{\downarrow }_{\mathcal {P}}(c), s^{\uparrow }_{\mathcal {P}}(c)]=[0.4, 0.8]\).

Since \(s^{\uparrow }_{\mathcal {P}}(x^{*}_{\mathcal {P}}) = 0.7 < 0.8 = s^{\uparrow }_{\mathcal {P}}(y^{\circ }_{\mathcal {P}})\), the analysis performed by CSS1 proceeds by considering the second case (case ii). CSS1 asks either to compare alternative c with level 0.5 or to compare alternative c with level 0.7 (their “score” is the same).
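
As a quick check of this tie, assume that the scale L proceeds in steps of 0.1 and that \(d\) counts the steps between two levels (both are assumptions made here for illustration). Scoring the query on level 0.7 with numerator \(d(s^{\uparrow }_{\mathcal {P}}(c),s^{\uparrow }_{\mathcal {P}}(b))\) and the query on level 0.5 with numerator \(d(s^{\downarrow }_{\mathcal {P}}(b),s^{\downarrow }_{\mathcal {P}}(c))\), both over the length of the interval of c, gives

\[ \frac{d(0.8, 0.7)}{d(0.8, 0.4)} = \frac{1}{4}, \qquad \frac{d(0.5, 0.4)}{d(0.8, 0.4)} = \frac{1}{4}. \]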

HLG asks about alternative d, which has the widest gap \([s^{\downarrow }_{\mathcal {P}}(d),s^{\uparrow }_{\mathcal {P}}(d)]=[0,0.5]\), and its midpoint level 0.2. This query is not very informative since we already know that d cannot be strictly better than b.

Table 1. Simulation results (averaged over 30 runs) showing the number of queries that are needed in order to find a necessary winner.

Fig. 2. The values \(s^*\) and \(s^{\circ }\) as a function of the number of queries with CSS1 on the tiny dataset (averaged over 30 runs)

4 Experimental Results

We evaluate the proposed paradigm with numerical experiments where we simulate the elicitation process by assuming that the preferences of a decision maker are consistent with a Sugeno integral with a capacity v. At each step of the simulation, a question of the type “Is \(S_v(x)\) lower than or equal to \(\alpha \)?” is asked to the decision maker. The simulated user answers such questions based on the “true” capacity v, and the answers are used by our algorithm to update the lattice of consistent capacities, to determine the maximin recommendations, and to select the question to ask next, as discussed in Sect. 3.2.
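
The simulated user can be sketched as follows. This is a minimal illustration, assuming the standard definition of the Sugeno integral, \(S_v(x) = \max _{A} \min (v(A), \min _{i \in A} x_i)\), evaluated on the upper level sets of x (which suffices because v is monotone). The capacity is stored as a dictionary from frozensets of criterion indices to integer level indices; the names `sugeno` and `answer` are hypothetical.

```python
from typing import Dict, FrozenSet, Sequence

Capacity = Dict[FrozenSet[int], int]  # subset of criteria -> level index


def sugeno(x: Sequence[int], v: Capacity) -> int:
    """Sugeno integral of profile x with respect to capacity v.

    Only the upper level sets {j : x_j >= x_i} need to be inspected,
    since v is assumed to be monotone.
    """
    best = 0
    for xi in set(x):
        upper = frozenset(j for j, xj in enumerate(x) if xj >= xi)
        best = max(best, min(xi, v[upper]))
    return best


def answer(x: Sequence[int], alpha: int, v: Capacity) -> bool:
    """Simulated reply to the query: is S_v(x) lower than or equal to alpha?"""
    return sugeno(x, v) <= alpha
```

The elicitation algorithm itself never sees v; it only observes the Boolean replies returned by `answer`.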

In our tests we considered 3 different datasets: a very small dataset, dubbed “tiny”, of 7 items and 4 criteria (with an evaluation scale of 20 levels), a randomly generated “synthetic” dataset (30 items, 8 criteria, 25 levels), and a dataset of “cars” (100 items, 6 criteria, 5 levels).

Simulated users answer queries according to capacities that are either WeightedMax, WeightedMin or generic capacities (note that the form of the capacity is not known to the elicitation algorithm). Capacities are randomly generated in the following way: for WeightedMax and WeightedMin the weight vector is uniformly sampled (with one criterion forced to have weight \(1_L\)). For generic capacities, we iteratively pick a random subset of criteria and assign it a random level (sampled uniformly) between \(0_L\) and \(1_L\), with subsets and supersets updated according to monotonicity; the process is repeated a fixed number of times.
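
One possible reading of this generation procedure is sketched below; several details are assumptions, since the text does not fix them. Levels are integer indices from 0 to `top`, the generic capacity starts from the smallest capacity (0 everywhere except on the full set), only proper non-empty subsets are drawn (so at least 2 criteria are needed), a WeightedMax capacity is taken to be \(v(A) = \max _{i \in A} w_i\), and all function names are hypothetical. WeightedMin is omitted for brevity.

```python
import random
from itertools import combinations
from typing import Dict, FrozenSet, List

Capacity = Dict[FrozenSet[int], int]


def all_subsets(n: int) -> List[FrozenSet[int]]:
    """All subsets of the criteria {0, ..., n-1}."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]


def weighted_max_capacity(n: int, top: int, rng: random.Random) -> Capacity:
    """v(A) = max of the weights in A; one random criterion gets the top weight."""
    w = [rng.randint(0, top) for _ in range(n)]
    w[rng.randrange(n)] = top
    return {A: max((w[i] for i in A), default=0) for A in all_subsets(n)}


def generic_capacity(n: int, top: int, steps: int, rng: random.Random) -> Capacity:
    """Repeatedly assign a random level to a random subset, restoring monotonicity."""
    subsets = all_subsets(n)
    full = frozenset(range(n))
    v = {A: 0 for A in subsets}  # assumption: start from the smallest capacity
    v[full] = top
    proper = [A for A in subsets if A and A != full]  # requires n >= 2
    for _ in range(steps):
        A = rng.choice(proper)
        level = rng.randint(0, top)
        for B in subsets:
            if B == A:
                v[B] = level
            elif B > A:   # proper superset: raise to at least `level`
                v[B] = max(v[B], level)
            elif B < A:   # proper subset: lower to at most `level`
                v[B] = min(v[B], level)
    return v


def is_capacity(v: Capacity, n: int, top: int) -> bool:
    """Check the boundary conditions and monotonicity with respect to inclusion."""
    full = frozenset(range(n))
    return (v[frozenset()] == 0 and v[full] == top
            and all(v[A] <= v[B] for A in v for B in v if A <= B))
```

Each update keeps the capacity monotone: supersets of the chosen subset are only raised, subsets are only lowered, and sets incomparable with it are left untouched.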

We compare the effectiveness of the heuristic strategies (presented in Sect. 3.2) for choosing the next question to ask the user. In Table 1 we show the average number of queries that are needed to find a necessary winner according to the different query strategies, in the different simulation settings (all experiments have been repeated for 30 runs).

In Fig. 2 we provide, for one of the experiments, the detail of how the values \(s^{*}_{\mathcal {P}}\) and \(s^{\circ }_{\mathcal {P}}\) evolve over time: the former is monotonically non-decreasing, while the latter decreases most of the time. Note that our protocol can be terminated early, providing a “good” recommendation before a necessary winner is found.

The experimental results show the superiority of the CSS strategies with respect to the other heuristics, with both CSS0 and CSS1 performing quite well.

5 Discussion and Conclusions

The Sugeno integral is used as an aggregation method for multicriteria decision making. Despite its popularity, the elicitation of a Sugeno integral is still a problematic issue. In this paper we have provided a novel formalization for decision-making under capacity uncertainty using the maximin utility criterion. We have provided an incremental elicitation method for determining a necessary winner; at each step the user is asked to answer a query of the type “Is the Sugeno value of item x at most \(\alpha \)?” where x and \(\alpha \) are dynamically chosen to improve the knowledge about the capacity as much as possible. The algorithm maintains a representation of the lattice of consistent capacities that is updated after each answer. We provided an experimental validation of the approach with simulations comparing different heuristics for choosing the next question to ask.

Several directions for future research are possible: additional strategies for choosing the next question (for instance adapting the ideas of [14] in an ordinal setting), experimentation with real data, the extension to different types of questions (e.g. comparing two alternatives) and handling combinatorial domains. We are also interested in methods that support the interpretation of real data, for instance by using if-then rules based on Sugeno integrals. As the Sugeno integral represents a single threshold rule [6], an interesting direction is to adapt our procedure for Sugeno Utility Functionals (SUF) [4].

Another challenge is the prediction of preferences [9]; one idea is to compute a family of capacities on a training set of preferences and use it for prediction. The analogy-based method [3] also seems to be an approach worth considering.