
1 Introduction

In sequential decision making, a strategy is a conditional plan that assigns a (possibly non-deterministic) action to each state where a decision has to be made (also called a “decision node”). Each strategy leads to a compound lottery, following von Neumann and Morgenstern's terminology [12]: roughly, it is a tree representing the different possible scenarios, and thus the different possible final states that the plan/strategy may reach. The optimal strategy is then the one that optimizes a criterion whose value depends on the utilities of the final states and on the resulting compound lottery.

Three assumptions are desirable in order to accept an optimal strategy without questioning its meaning. Those assumptions are:

  • Dynamic Consistency: when a decision node is reached while following an optimal strategy, the best decision at this node is the one that was selected when computing this strategy, i.e. prior to applying it.

  • Consequentialism: the best decision at each step of the problem only depends on potential consequences at this point.

  • Tree Reduction: a compound lottery is equivalent to a simple one, assigning probabilities to final states.

Those three assumptions are instrumental to enable an optimal strategy to be computed using dynamic programming [10].

When the decision maker is able to provide a probability distribution on the possible states and considers that utilities are additive, a classical approach is the one based on expected utility. Under these strong requirements, the three assumptions above are satisfied.

When the problem is pervaded with possibilistic uncertainty, two families of criteria make sense: a Sugeno integral-based criterion [5] and the Choquet integral [6]. The criterion based on the Choquet integral turns out to be incompatible with the above assumptions: it may happen that none of the optimal strategies is dynamically consistent or consequentialist. The possibilistic criteria based on Sugeno integral do not meet such difficulties. Neither does the generalization of the optimistic and pessimistic possibilistic criteria proposed in [7]. It aggregates the optimistic and pessimistic criteria by means of a uninorm [13], a semi-group operation whose identity element plays the role of a degree of optimism. Contrary to other criteria that account for a degree of optimism, like the Hurwicz criterion, the criterion proposed in [7] satisfies the three assumptions governing the good behavior of the decision tree.

In the present paper, we are looking for new decision criteria, beyond expected utility and possibilistic integrals, that can apply to decision trees and respect the three properties recalled above (Dynamic Consistency, Consequentialism and Tree Reduction).

2 Decision Trees

A convenient language to introduce sequential decision problems is the one of decision trees [10]. This framework proposes an explicit graphical model, representing each possible scenario by a path from the root to the leaves of a tree. Formally, a decision tree \(\mathcal {T} = (\mathcal {N}, \mathcal {E})\) is such that \(\mathcal {N}\) contains three kinds of nodes (see Fig. 1 for an example):

  • \(\mathcal {D}\) is the set of decision nodes (depicted by rectangles).

  • \(\mathcal {LN} \) is the set of leaves, which represent final states in \(\mathcal {S}\); such states can be evaluated by a utility function: \(\forall s_i \in \mathcal {S}\), \(u(s_i)\) is the degree of satisfaction of eventually being in state \(s_i\) (i.e. of reaching the corresponding node in \(\mathcal {LN}\)). For the sake of simplicity we assume, without loss of generality, that utilities are attached to leaf nodes only.

  • \(\mathcal {X}\) is the set of chance nodes (depicted by circles).

For any node \(n_{i} \in \mathcal {N}\), \(Succ(n_{i}) \subseteq \mathcal {N}\) denotes the set of its children. In a decision tree, for any decision node \(d_i, Succ(d_i) \subseteq \mathcal {X}\): \(Succ(d_i)\) is the set of actions that can be chosen when \(d_i \) is reached. For any chance node \(x_i, Succ(x_i) \subseteq \mathcal {LN} \cup \mathcal {D}\): \(Succ(x_i)\) is the set of possible outcomes of action \(x_i \) - either a leaf node is observed, or a decision node is reached (and then a new action should be chosen).

Fig. 1. A possibilistic decision tree.

Solving a decision tree amounts to building a strategy, i.e. a function \(\delta \) that associates to each decision node \(d_i\) an action (i.e. a chance node) in \(Succ(d_i)\): \(\delta (d_i)\) is the action to be executed when decision node \(d_i \) is reached. Let \(\varDelta \) be the set of strategies that can be built for \(\mathcal {T}\). We shall also consider the subtree \(\mathcal {T}_n\) of \(\mathcal {T}\) rooted at node \(n \in \mathcal {N}\), and denote by \(\varDelta _n\) its strategies: they are substrategies of the strategies of \(\varDelta \).

The satisfaction of the decision maker for a consequence (on a leaf of the tree) is captured by a utility degree on a totally ordered scale. The scale [0, 1] is generally chosen for these degrees, but any ordered set can be used.

Here, we focus on the case where the information at each chance node is fully captured by a distribution over outcomes of the chance nodes, namely a possibility distribution \(\pi \) and/or a probability distribution p. When bearing on the leaf nodes, such distributions define simple lotteries on the utility degrees. More formally:

  • A simple probabilistic lottery \(L^{p}\) [12] is a probability distribution p on a set of utility degrees \(\varLambda = \{\lambda _1,..., \lambda _n\}\). Probabilistic lotteries will be written \(L^{p}= {<}p_1/\lambda _1,..., p_n / \lambda _n {>}\) with \(p_i\in [0,1], \sum _{i=1}^{n}p_i=1\).

  • A simple possibilistic lottery \(L^{\pi }\) [5] is a normalized possibility distribution \(\pi \) on a set of utility degrees \(\varLambda \), both being expressed on the same ordered scale. Possibilistic lotteries will be written \(L^{\pi }= {<}\pi _1/\lambda _1,..., \pi _n / \lambda _n {>}\) with \(\pi _i\in [0,1], \max _{i=1}^{n}\pi _i=1\).

In a lottery \(L^{\pi }\) (resp. \(L^p\)), the value \(\pi _i\) (resp. \(p_i\)) is the possibility (resp. probability) degree of getting utility \(\lambda _i\) according to the decision strategy captured by the lottery. For the sake of brevity, the \(\lambda _i\)’s such that \(\pi _i = 0\) (resp. \(p_i=0\)) are often omitted in the notation of a lottery (e.g., \(({<}1/0.8{>})\) denotes the lottery that provides utility 0.8 for sure, all the other utility degrees being impossible).

The expected utility of probabilistic simple lotteries was proposed by von Neumann and Morgenstern [12] as a decision criterion under risk: \(E(L^{p}) = \sum _{\lambda _i \in \varLambda } \lambda _i\cdot p_i \). Dubois and Prade [5] proposed to use optimistic and pessimistic possibilistic criteria, denoted by \(U^{Pes}\) and \(U^{Opt}\) in this paper, to evaluate the global utility of a possibilistic lottery, using \(U^{Pes}(L^\pi ) = \min _{\lambda _i \in \varLambda } \max (1 - \pi _i, \lambda _i)\) and \( U^{Opt}(L^\pi )= \max _{\lambda _i \in \varLambda } \min ( \pi _i, \lambda _i)\).
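To fix notation, here is a minimal sketch (ours, not part of the paper) of these three criteria, assuming a simple lottery is encoded as a Python dictionary mapping utility degrees \(\lambda _i\) to weights:

# Illustrative only: simple lotteries as {utility: weight} dictionaries.

def expected_utility(lottery_p):
    # E(L^p) = sum_i lambda_i * p_i
    return sum(lam * p for lam, p in lottery_p.items())

def u_pessimistic(lottery_pi):
    # U^Pes(L^pi) = min_i max(1 - pi_i, lambda_i)
    return min(max(1 - pi, lam) for lam, pi in lottery_pi.items())

def u_optimistic(lottery_pi):
    # U^Opt(L^pi) = max_i min(pi_i, lambda_i)
    return max(min(pi, lam) for lam, pi in lottery_pi.items())

print(u_optimistic({0.3: 0.2, 0.8: 1.0}))      # 0.8
print(u_pessimistic({0.3: 0.2, 0.8: 1.0}))     # 0.8
print(expected_utility({0.3: 0.4, 0.8: 0.6}))  # 0.6 (up to floating-point rounding)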

Let us now consider full-fledged strategies. A strategy in \(\varDelta \) can be viewed as a connected subtree of \( \mathcal {T}\) where exactly one edge (and thus one chance node) is kept at each decision node. Skipping the decision nodes, we get a chance tree or, using von Neumann and Morgenstern's terminology, a compound lottery: like simple lotteries, which are distributions over utilities, compound lotteries are distributions over (simple or compound) lotteries.

The idea is to define a simple lottery equivalent to the original, compound one and to apply the decision criterion to this simple lottery.

Definition 1

(\(Reduction^{p}\)). For any probabilistic compound lottery of the form \( L^{p}= {<}p_1 / L^{p}_1, \dots , p_m / L^{p}_m{>}\), \(Red^{p}(L^{p})\) is the simple lottery that associates to each \(\lambda _i\) the probability degree \(\overline{p}_i = \sum _{ j=1,...,m} p_j\cdot p^j_i\), where \(p^j_i\) denotes the probability of getting \(\lambda _i\) through lottery \(L^{p}_j \) and \(p_j\) the probability of getting \(L^{p}_j\).

Definition 2

(\(Reduction^{\pi }\)). For any possibilistic compound lottery of the form \( L^{\pi }= {<}\pi _1 / L^{\pi }_1, \dots , \pi _m / L^{\pi }_m{>}\), \(Red^{\pi }(L^{\pi })\) is the simple lottery that associates to each \(\lambda _i\) the possibility degree \(\overline{\pi }_i = \max _{j=1,...,m} \min (\pi _j, \pi ^j_i)\), where \(\pi ^j_i\) denotes the possibility of getting \(\lambda _i\) through lottery \(L^{\pi }_j \) and \(\pi _j\) the possibility of getting \(L_j^\pi \).
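As an illustration (our own sketch, with hypothetical names), the two reductions can be coded directly from Definitions 1 and 2, a compound lottery being a list of (weight, simple lottery) pairs:

from collections import defaultdict

def reduce_prob(compound):
    # Red^p: bar p_i = sum_j p_j * p^j_i
    red = defaultdict(float)
    for p_j, lot_j in compound:
        for lam, p_ji in lot_j.items():
            red[lam] += p_j * p_ji
    return dict(red)

def reduce_poss(compound):
    # Red^pi: bar pi_i = max_j min(pi_j, pi^j_i)
    red = defaultdict(float)
    for pi_j, lot_j in compound:
        for lam, pi_ji in lot_j.items():
            red[lam] = max(red[lam], min(pi_j, pi_ji))
    return dict(red)

# <0.5/<1/0.2>, 0.5/<0.4/0.2, 0.6/1>> reduces to <0.7/0.2, 0.3/1>.
print(reduce_prob([(0.5, {0.2: 1.0}), (0.5, {0.2: 0.4, 1.0: 0.6})]))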

The principle of lottery reduction allows the comparison of compound lotteries: L is preferred to \(L'\) iff its reduction is preferred to that of \(L'\). Optimality can then be soundly defined:

  • \(\delta \in \varDelta \) is optimal for a decision tree \(\mathcal {T}\) iff \(\forall \delta ' \in \varDelta , C(Red(L_\delta )) \succeq C(Red(L_{\delta '}))\),

for a criterion C. The principle of monotonicity and the one of decomposition [12] are valid for expected utility, and also for \(U^{Pes}\) and \(U^{Opt}\) in a weak form:

Definition 3

(Weak monotonicity). A preference criterion C over possibilistic/probabilistic lotteries is said to be weakly monotonic iff for all lotteries L, \(L'\), \(L''\) and for all pairs of weights u, v forming a normalized possibility/probability distribution:

$$\begin{aligned} C(L) \le C(L') \Rightarrow C({<}u/L, v/L''{>}) \le C({<}u/L', v/L''{>}). \end{aligned}$$
(1)

Importantly, all approaches that satisfy weak monotonicity, in particular the approach considered in this paper, also satisfy Dynamic Consistency, Consequentialism and Tree Reduction. This guarantees coherence with the intuition of rationality; it is also important from the algorithmic point of view, since it makes it possible to find an optimal strategy by dynamic programming.
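To illustrate this algorithmic consequence, here is a schematic backward-induction routine (ours, not the paper's); it is parameterized by any criterion C and any reduction operator, e.g. the expected_utility and reduce_prob functions of the previous sketches:

def best_strategy(node, C, reduce_op):
    # Returns (simple lottery of the optimal substrategy, strategy as {decision name: chosen index}).
    # Leaves are utility degrees; chance nodes are ('chance', [(weight, child), ...]);
    # decision nodes are ('decision', name, [chance_child, ...]).
    if isinstance(node, (int, float)):          # leaf: this utility for sure
        return {node: 1.0}, {}
    if node[0] == 'chance':
        parts, strategy = [], {}
        for w, child in node[1]:
            lot, sub = best_strategy(child, C, reduce_op)
            parts.append((w, lot))
            strategy.update(sub)
        return reduce_op(parts), strategy       # tree reduction
    _, name, children = node                    # decision node
    options = [best_strategy(child, C, reduce_op) for child in children]
    k = max(range(len(options)), key=lambda i: C(options[i][0]))
    lot, sub = options[k]
    return lot, {name: k, **sub}                # consequentialist local choice

Weak monotonicity is exactly what guarantees that these local, bottom-up choices yield a globally optimal strategy.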

In the following we consider decision trees where uncertainty is captured by set functions that are more general than probability and possibility measures, while taking into account a degree of optimism of the decision maker, although without giving up the monotonicity principle.

3 Hybrid Possibility-Probability Measures

The three previous criteria (expected utility, pessimistic and optimistic possibilistic utility functionals) are particular instances of generalized integrals (Choquet integral for expected utility, Sugeno integral for the other ones) based on fuzzy measures.

Definition 4

A fuzzy measure is a set function \(\mu :2^S\rightarrow [0,1]\), satisfying the following axioms:

  • \(\mu (\emptyset )=0\); \(\mu (S)=1\) (limit conditions)

  • \(\mu (A\cup B)\ge \mu (A)\) (monotonicity)

Fuzzy measures include, among others, probability measures, necessity measures and possibility measures.

However, we look for special fuzzy measures representable by means of lotteries (or, equivalently, by distributions on utility values). Hence they must be decomposable:

Definition 5

[4]. A decomposable fuzzy measure is a fuzzy measure \(\mu \) for which there is a t-conorm S such that \(\mu (A\cup B) = S(\mu (A), \mu (B))\) whenever A and B are disjoint.

Possibility measures are \(\max \)-decomposable (for the t-conorm \(\max \)), while probability measures are additively decomposable using the Łukasiewicz t-conorm \(\min (a+b, 1)\). To define lottery reduction, we also need a generalization of the notion of independence between events.

Definition 6

Let T be a triangular norm. Two events A and B are said to be T-separable with respect to a fuzzy measure \(\mu \) if and only if \(\mu (A\cap B) = T(\mu (A), \mu (B))\).

If A and B are disjoint events T-separable from another event C, then, since \((A\cup B)\cap C = (A\cap C)\cup (B\cap C)\), a decomposable fuzzy measure should satisfy

$$ T(S(\mu (A), \mu (B)), \mu (C)) = S(T(\mu (A),\mu (C)), T(\mu (B),\mu (C))),$$

which requires a property called Conditional Distributivity of T over S. This property is essential for enabling the reduction of generalised lotteries.

Definition 7

A t-norm T is conditionally distributive over a t-conorm S if for all \(x,y,z\in [0,1]\) we have

$$(CD) \,T(x,S(y,z))=S(T(x,y),T(x,z)), \text { whenever } S(y,z)<1.$$

In [3], it has been proved that the only pairs (t-conorm, t-norm) satisfying condition (CD) are built from the pairs \((\max , T)\) and \((\min (a+b, 1), \times )\) [3, 9], where T is an arbitrary continuous t-norm. They form a parametric family of pairs denoted by \((S^\alpha , T^\alpha )\), with parameter \(\alpha \), where \(S^\alpha \) (resp. \(T^\alpha \)) is the ordinal sum [9] of \(\max \) and \(\min (x+y,1)\) (resp. of T and the product), represented in Fig. 2 for \(T = \min \).

Fig. 2. \(S^{\alpha }\) and \(T^{\alpha }\)

The pairs \((S^\alpha , T^\alpha )\) where \(T^\alpha \) is conditionally distributive over \(S^\alpha \) are thus of the form

$$\begin{aligned} \begin{array}{c} S^{\alpha }(x,y)= {\left\{ \begin{array}{ll} x+y-\alpha \text { if } x> \alpha , y> \alpha \\ \max (x,y) \text { otherwise,} \end{array}\right. } \end{array} \end{aligned}$$
(2)
$$\begin{aligned} \begin{array}{c} T^{\alpha }(x,y)= {\left\{ \begin{array}{ll} \alpha +\frac{(x-\alpha )(y-\alpha )}{1-\alpha }\text { if } x> \alpha , y> \alpha \\ \min (x,y) \text { otherwise.} \end{array}\right. } \end{array} \end{aligned}$$
(3)
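For concreteness, here is a small sketch (ours) of the pair \((S^{\alpha }, T^{\alpha })\) of Eqs. (2) and (3), together with a numerical check of condition (CD); as an extra assumption, we cap \(S^{\alpha }\) at 1 so that the operation stays within the scale even for unnormalized arguments:

def S_alpha(x, y, alpha):
    # Ordinal sum of max and the rescaled Lukasiewicz t-conorm (Eq. (2)).
    if x > alpha and y > alpha:
        return min(x + y - alpha, 1.0)
    return max(x, y)

def T_alpha(x, y, alpha):
    # Ordinal sum of min and the rescaled product t-norm (Eq. (3)).
    if x > alpha and y > alpha:
        return alpha + (x - alpha) * (y - alpha) / (1.0 - alpha)
    return min(x, y)

# (CD): T(x, S(y, z)) == S(T(x, y), T(x, z)) whenever S(y, z) < 1.
a, x, y, z = 0.7, 0.9, 0.8, 0.75
assert abs(T_alpha(x, S_alpha(y, z, a), a)
           - S_alpha(T_alpha(x, y, a), T_alpha(x, z, a), a)) < 1e-12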

An \(S^\alpha \)-decomposable fuzzy measure, denoted by \(\rho ^{\alpha }\), is a hybrid function between possibility and probability measures defined as follows:

Definition 8

(Hybrid \(\pi \)-p measure [3]). A hybrid possibility-probability measure \(\rho ^\alpha \) is a fuzzy measure such that, for disjoint sets A and B:

$$\begin{aligned} \rho ^{\alpha }(A\cup B)&= S^{\alpha }(\rho ^{\alpha }(A),\rho ^{\alpha }(B)) \end{aligned}$$
(4)

The fuzzy measure \(\rho ^{\alpha }\) is clearly a probability measure (rescaled on \([\alpha , 1]\)) for likely events, and a possibility measure if one of the events is unlikely. The hybrid \(\pi \)-p measure is an example of level-dependent capacity [8]. For events A, B that can be considered independent, we have

$$\begin{aligned} \rho ^{\alpha }(A\cap B)=T^{\alpha }(\rho ^{\alpha }(A),\rho ^{\alpha }(B)) \end{aligned}$$
(5)

The limit condition \(\rho ^{\alpha }(S) = 1\) comes down to enforcing, for any event A and its complement \(A^c\), the duality condition \(\rho ^{\alpha }(A) + \rho ^{\alpha }(A^c) = 1 + \alpha \) when \(\min (\rho ^{\alpha }(A),\rho ^{\alpha }(A^c)) >\alpha \), and \(\max (\rho ^{\alpha }(A), \rho ^{\alpha }(A^c)) = 1\) otherwise.

Because it is decomposable, \(\rho ^{\alpha }\) is completely defined by a distribution of weights \(\rho ^{\alpha }\) on the singletons of the referential S. Applying (4) to the union of singletons yields the normalization condition suggested in [3]:

Proposition 1

A distribution \(\rho ^\alpha \) on S defines a normalised \(S^\alpha \)-decomposable fuzzy measure if and only if

$$\begin{aligned} \sum _{s: \rho ^{\alpha }(s)>\alpha }\rho ^{\alpha }(s)=1+(card(C^+_\alpha ) -1)\times \alpha \end{aligned}$$
(6)

where \(C^+_\alpha = \{s,\rho ^{\alpha }(s)>\alpha \}\).

Proof

(not given in [3]). Let \(C^+_\alpha = \{s,\rho ^{\alpha }(s)>\alpha \}\). Then, by associativity and commutativity, we have \(\rho ^{\alpha }(S) = S^\alpha _{s}(\rho ^{\alpha }(s))= \max (\max _{s \not \in C^+_\alpha } \rho ^{\alpha }(s), S^\alpha _{s \in C^+_\alpha } (\rho ^{\alpha }(s)))\), and the latter term is \(S^\alpha _{s \in C^+_\alpha } (\rho ^{\alpha }(s)) = \sum _{s\in C^+_\alpha }\rho ^{\alpha }(s) - (card(C^+_\alpha )-1)\alpha >\alpha \ge \max _{s \not \in C^+_\alpha } \rho ^{\alpha }(s) \). Clearly \(S^\alpha _{s}(\rho ^{\alpha }(s)) =S^\alpha _{s \in C^+_\alpha } (\rho ^{\alpha }(s)) = 1\) yields condition (6). \(\Box \)

Note that \(C^+_\alpha \ne \emptyset \), for otherwise \(\rho ^{\alpha }(S) \le \alpha < 1\). If \(card(C^+_\alpha ) = 1\), then \(\rho ^{\alpha }(S) = \max _{s}\rho ^{\alpha }(s) = 1\), i.e. \(\rho ^{\alpha }\) is a possibility measure. If \(card(C^+_\alpha ) = n = card(S)\), then \(\rho ^{\alpha }(S) = 1\) reads \(\sum _{s\in S}\rho ^{\alpha }(s) - (n-1)\alpha = 1\), hence \(\sum _{s\in S}\rho ^{\alpha }(s) = 1 + (n-1)\alpha \) and the fuzzy measure \(\rho ^{\alpha }\) is additive.

Example 1

Consider the \(S^{0.7}\)-decomposable fuzzy measure \(\rho ^{0.7}\) on the set \(S=\{s_a,s_b,s_c,s_d\}\), defined by the distribution \(\rho ^{0.7}(s_a)=0.5\), \(\rho ^{0.7}(s_b)=0.5\), \(\rho ^{0.7}(s_c)=0.8\), \(\rho ^{0.7}(s_d)=0.9\). Here, \(C^+_\alpha = \{s_c, s_d\}\). One can check that \(\rho ^{0.7}(S) = 0.8 + 0.9 - 0.7 = 1\): it is normalized.

More generally it is easy to express the value of \(\rho ^\alpha \) on general events:

$$ \rho ^\alpha (A) ={\left\{ \begin{array}{ll} \mathop {\sum }\nolimits _{s \in A} \rho ^\alpha (s) - \alpha (card(A)- 1) \text { if } A \subseteq C^+_\alpha , \\ \mathop {\max }\nolimits _{s \in A} \rho ^\alpha (s) \text { if } A \subseteq \overline{C^+_\alpha }, \\ \mathop {\sum }\nolimits _{s \in A\cap C^+_\alpha } \rho ^\alpha (s) - \alpha (card(A\cap C^+_\alpha )- 1) \text { otherwise. }\end{array}\right. }$$
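A small sketch (ours) of this event formula, checked on the distribution of Example 1:

def rho_event(dist, A, alpha):
    # rho^alpha(A) from a distribution on singletons (dict: state -> weight).
    plus = [dist[s] for s in A if dist[s] > alpha]   # weights of A restricted to C^+_alpha
    if plus:
        return sum(plus) - alpha * (len(plus) - 1)
    return max(dist[s] for s in A)

dist = {'a': 0.5, 'b': 0.5, 'c': 0.8, 'd': 0.9}              # Example 1, alpha = 0.7
print(round(rho_event(dist, ['a', 'b', 'c', 'd'], 0.7), 3))  # 1.0: normalization, Eq. (6)
print(round(rho_event(dist, ['a', 'b'], 0.7), 3))            # 0.5: possibilistic part (max)
print(round(rho_event(dist, ['a', 'c'], 0.7), 3))            # 0.8: third case of the formula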

Note that the third case (\(C^+_\alpha \cap A \ne \emptyset \)) covers the first. We can moreover prove that any measure \(\rho ^\alpha \) is a plausibility measure in the sense of Shafer [11] obtained as a probabilistic mixture between a possibility measure and a probability measure.

Proposition 2

For any hybrid possibility-probability function \(\rho ^\alpha \), there exists a possibility measure \(\varPi \) with possibility distribution \(\pi \) and a probability measure P with distribution p such that \(\rho ^\alpha (s) = \alpha \pi (s) + (1- \alpha )p(s)\) where \(\forall s, \pi (s) < 1\) implies \(p(s) = 0\). Moreover \(\forall A, \rho ^\alpha (A) = \alpha \max _{s \in A}\pi (s) + (1- \alpha )\sum _{s \in A}p(s)\).

Proof

For events, if \(A \cap C^+_\alpha = \emptyset \), \( \rho ^\alpha (A) = \alpha \max _{s \in A} \pi (s) = \alpha \varPi (A) + (1-\alpha )P(A)\) since \(P(A)=0\). Otherwise, if \(A \cap C^+_\alpha \ne \emptyset \):

$$\begin{aligned} \rho ^\alpha (A)&= \sum _{s \in A\cap C^+_\alpha } \rho ^\alpha (s) - \alpha (card(A\cap C^+_\alpha )- 1)\\&= \sum _{s \in A \cap C^+_\alpha } (\alpha (1-p(s)) + p(s) ) - \alpha (card(A\cap C^+_\alpha )- 1)\\&= \alpha card(A\cap C^+_\alpha ) - \alpha P(A\cap C^+_\alpha ) + P(A\cap C^+_\alpha ) - \alpha (card(A\cap C^+_\alpha )- 1) \\&= \alpha + (1-\alpha ) P(A\cap C^+_\alpha ) = \alpha \varPi (A) + (1-\alpha ) P(A) \end{aligned}$$

since \(\varPi (A)=1\) and \(p(s)= 0 \text { for } s \not \in C^+_\alpha . \) \(\Box \)

Note that the dual of \(\rho ^\alpha \) (\(\overline{\rho }^\alpha ( A) = 1- \rho ^\alpha (\overline{A})\)) is a convex combination of a necessity measure and a probability measure. Indeed, we have \(\overline{\rho }^\alpha ( A) = 1 - \alpha \varPi (\overline{A}) - (1-\alpha ) P(\overline{A}) = \alpha (1 - \varPi (\overline{A})) + (1- \alpha )( 1- P(\overline{A})) = \alpha N(A) + (1- \alpha )P(A) \). When the possibility distribution is vacuous (\(C^+_\alpha = S\)), it is a special case of Shafer's discounting scheme [11], and is known in the imprecise probability literature as the linear-vacuous model [1]. In particular, \(\rho ^\alpha = \alpha \pi + (1- \alpha )p\) is a normalized \(\rho ^{\alpha }\) distribution.
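A numerical check (ours) of Proposition 2 on the distribution of Example 1; the \(\pi \) and p below are recovered from \(\rho ^{0.7}\) by the construction used in the proof:

alpha = 0.7
dist = {'a': 0.5, 'b': 0.5, 'c': 0.8, 'd': 0.9}
core = {s for s, w in dist.items() if w > alpha}                      # C^+_alpha = {'c', 'd'}
pi = {s: 1.0 if s in core else dist[s] / alpha for s in dist}         # pi(s) = rho(s)/alpha off the core
p  = {s: (dist[s] - alpha) / (1 - alpha) if s in core else 0.0 for s in dist}

def rho_mix(A):
    # alpha * Pi(A) + (1 - alpha) * P(A)
    return alpha * max(pi[s] for s in A) + (1 - alpha) * sum(p[s] for s in A)

for A in (['a', 'b'], ['a', 'c'], ['a', 'b', 'c', 'd']):
    print(round(rho_mix(A), 3))    # 0.5, 0.8, 1.0 -- the same values as rho^0.7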

Hybrid \(\pi \)-p distributions combine models of two extreme behaviors in uncertain contexts, namely when the knowledge can be expressed through a possibility distribution and when it can be expressed through a probability distribution. Hybrid distributions behave like probabilities only on the states with maximal possibility. As seen above, the hybrid model must satisfy the constraint \(P(s)=0 \text { if } \pi (s)<1, \forall s\). It is clear that \(C^+_\alpha \) is the core of \(\pi \). Then, from both distributions and a given threshold \(\alpha \), we can build the hybrid one \(\rho ^{\alpha }\) using the weighted average \(\rho ^{\alpha }(s) = \alpha \pi (s) + (1- \alpha )p(s)\). The idea is that the decision-maker provides a standard probability distribution (e.g., frequencies) on normal states. Then she considers that there is a subjective probability \(\alpha \) that the actual state has not been observed, and defines a possibility measure on the states of zero probability, an idea in agreement with the handling of zero-probability events in de Finetti's approach to subjective probability (see [2]).

4 Decision Making on Hybrid \(\pi \)-p Decision Trees

Back to our problem of decision making under uncertainty, let us consider hybrid \(\pi \)-p simple lotteries \(L = {<}\rho _1/\lambda _1,..., \rho _n / \lambda _n {>}\). We define two utility functionals \(ES^{Opt}\) and \(ES^{Pes}\) based on such hybrid \(\pi \)-p distributions:

$$\begin{aligned} ES^{Opt}(L)&=S_{i=1,\dots , n}^{\alpha }(T^{\alpha }(\rho ^{\alpha ,L}_i,\lambda _i))\end{aligned}$$
(7)
$$\begin{aligned} ES^{Pes}(L)&=1-S_{i=1, \dots , n}^{\alpha }(T^{\alpha }(\rho ^{\alpha ,L}_i,1-\lambda _i)) \end{aligned}$$
(8)

Considering a lottery \(L^{\rho }= {<}\rho _1/\lambda _1,..., \rho _n / \lambda _n {>}\) we define a lottery \((1-L)^\rho \) by \((1-L)^{\rho }= {<}\rho _1/(1-\lambda _1),..., \rho _n /(1- \lambda _n) {>}\). We have the following semi-duality relation

$$\begin{aligned} ES^{Pes}(L^\rho ) = 1- ES^{Opt}((1-L)^\rho ). \end{aligned}$$
(9)

This property will be useful for some proofs in the following. For the sake of brevity, we will drop \(\alpha \) from \(S^{\alpha }\) and \(T^{\alpha }\) in the sequel.
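The following sketch (ours) implements Eqs. (7) and (8) for a simple lottery encoded as a dictionary \(\{\lambda _i: \rho _i\}\), reusing the S_alpha and T_alpha functions of the earlier sketch, and checks the semi-duality relation (9):

from functools import reduce

def ES_opt(lottery, alpha):
    # Eq. (7): S-combination of the terms T(rho_i, lambda_i).
    terms = [T_alpha(rho, lam, alpha) for lam, rho in lottery.items()]
    return reduce(lambda a, b: S_alpha(a, b, alpha), terms)

def ES_pes(lottery, alpha):
    # Eq. (8): 1 minus the S-combination of the terms T(rho_i, 1 - lambda_i).
    terms = [T_alpha(rho, 1 - lam, alpha) for lam, rho in lottery.items()]
    return 1 - reduce(lambda a, b: S_alpha(a, b, alpha), terms)

L = {0.2: 0.7, 0.6: 0.9, 0.8: 0.6, 1.0: 0.5}          # a rho^0.7 lottery
one_minus_L = {1 - lam: rho for lam, rho in L.items()}
assert abs(ES_pes(L, 0.7) - (1 - ES_opt(one_minus_L, 0.7))) < 1e-12   # Eq. (9)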

4.1 Hybrid Utility Functional and Decision Maker Behavior

Let us rewrite \(ES^{Opt}(L)\) and \( ES^{Pes}(L)\) more explicitly so as to lay bare their meaning.

Proposition 3

$$\begin{aligned} ES^{Opt}(L)= {\left\{ \begin{array}{ll} U^{Opt}(L) \text { if } \not \exists i \text { s.t. } \lambda _i> \alpha \text { with } \rho ^{\alpha }_i>\alpha \\ E^{Opt}(L)=\alpha + \frac{\sum _{i|\lambda _i,\rho ^{\alpha }_i>\alpha } (\lambda _i-\alpha ) (\rho ^{\alpha }_i-\alpha )}{1-\alpha } \text { otherwise} \end{array}\right. } \end{aligned}$$
(10)
$$\begin{aligned} ES^{Pes}(L)= {\left\{ \begin{array}{ll} U^{Pes}(L) \text { if } \not \exists i \text { s.t. } \lambda _i < 1-\alpha \text { with } \rho ^{\alpha }_i>\alpha \\ E^{Pes}(L)=1-\alpha - \frac{\sum _{i|1-\lambda _i,\rho ^{\alpha }_i>\alpha } (1-\lambda _i-\alpha ) (\rho ^{\alpha }_i-\alpha )}{1-\alpha } \text { otherwise} \end{array}\right. } \end{aligned}$$
(11)

Proof

If \(\not \exists i\) s.t. \(\lambda _i > \alpha \) with \(\rho ^{\alpha }_i>\alpha \), then \(\forall i\) the t-norm \(T =\min \) is used, and so is \(S= \max \): we are back to the \(U^{Opt}(L)\) criterion. If there is only one i s.t. \(\lambda _i > \alpha \) with \(\rho ^{\alpha }_i>\alpha \), then \(ES^{Opt}(L)= T^{\alpha }(\rho ^{\alpha }_i,\lambda _i) = \alpha + \frac{ (\lambda _i-\alpha ) (\rho ^{\alpha }_i-\alpha )}{1-\alpha } \). In the general case where there are several i such that \(\min (\rho ^{\alpha ,L}_i,\lambda _i)> \alpha \), say a set \(I^+\) of indices, then \(T(\rho ^{\alpha ,L}_i,\lambda _i)> \alpha \) for \(i \in I^+\) only, and \(ES^{Opt}(L)= \sum _{i \in I^+}(\alpha + \frac{ (\lambda _i-\alpha ) (\rho ^{\alpha }_i-\alpha )}{1-\alpha }) - (card(I^+)-1)\alpha = \alpha + \sum _{i \in I^+}\frac{ (\lambda _i-\alpha ) (\rho ^{\alpha }_i-\alpha )}{1-\alpha } \). Using the semi-duality between \(ES^{Pes}\) and \(ES^{Opt}\), we get the expression of the former. \(\Box \)

From Proposition 3 it is easy to check that:

Proposition 4

\(ES^{Opt}(L)\le \alpha \) iff \(ES^{Opt}(L)=U^{Opt}(L)\).

Likewise, \(ES^{Pes}(L)\ge 1-\alpha \) iff \(ES^{Pes}(L)=U^{Pes}(L)\).

Fig. 3. \(ES^{Opt}(L)\) and \(ES^{Pes}(L)\)

In other words, the criterion \(ES^{Opt}(L)\) is possibilistic and optimistic (\(= U^{Opt}(L)\)) so long as the entries (utilities or plausibilities) are below the threshold \(\alpha \) (distribution included in the blue area of Fig. 3). Otherwise, we get an expected value over the states with plausibilities and utilities greater than \(\alpha \) (see the green area in Fig. 3, left). Likewise, with \(ES^{Pes}(L)\), we get an expected value over the states with utility less than \(1-\alpha \) and with high enough plausibility, i.e. greater than \(\alpha \); we get the pessimistic possibilistic criterion \(U^{Pes}(L)\) otherwise (with either high utilities or low plausibilities); see the green area in Fig. 3, right.

Example 2

Let \(D_1\), \(D_2\) and \(D_3\) be decisions with \(\rho ^{0.70}\) distributions on \(\{a,b,c,d\}\), with \(\lambda _a=0.2\), \(\lambda _b=0.6\), \(\lambda _c=0.8\), \(\lambda _d=1\), \(D_1={<}0.7/\lambda _a,\) \(0.9/\lambda _b,0.6/\lambda _c,\) \(0.5/\lambda _d{>}\), \(D_2={<}0.75/\lambda _a,\) \(0.90/\lambda _b,\) \(0.75/\lambda _c,0.5/\lambda _d{>}\) and \(D_3={<}0.75/\lambda _a,\) \(0.85/\lambda _b,\) \(0.75/\lambda _c,0.75/\lambda _d{>}\). If the DM is optimistic, \(D_1\) is in the possibility area since \(\rho ^{0.7}_{D_1}(c)\) and \(\rho ^{0.7}_{D_1}(d)\) are \(\le 0.7\), while \(D_2\) and \(D_3\) are in the EU area of \(ES^{Opt}\). We have \(D_3\succ D_2\succ D_1\). Note that \(D_3\) is preferred to \(D_2\) since its expected utility restricted to the green area of Fig. 3 is greater. If the DM is pessimistic, \(D_1\) is in the possibility area while \(D_2\) and \(D_3\) are in the EU area, since \(\rho ^{0.7}_{D_2}(a)=\rho ^{0.7}_{D_3}(a)=0.75>\alpha \) with \(\lambda _a<1-\alpha \); the preference relation is then \(D_1\succ D_2\sim D_3\).
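The ES_opt / ES_pes sketch given after Eq. (9) reproduces these rankings numerically (values rounded):

alpha = 0.7
lam = {'a': 0.2, 'b': 0.6, 'c': 0.8, 'd': 1.0}
D1 = {lam['a']: 0.70, lam['b']: 0.90, lam['c']: 0.60, lam['d']: 0.50}
D2 = {lam['a']: 0.75, lam['b']: 0.90, lam['c']: 0.75, lam['d']: 0.50}
D3 = {lam['a']: 0.75, lam['b']: 0.85, lam['c']: 0.75, lam['d']: 0.75}

for name, D in (('D1', D1), ('D2', D2), ('D3', D3)):
    print(name, round(ES_opt(D, alpha), 3), round(ES_pes(D, alpha), 3))
# D1 0.6   0.3     optimistic ranking: D3 > D2 > D1
# D2 0.717 0.283   pessimistic ranking: D1 > D2 ~ D3
# D3 0.767 0.283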

4.2 Decision Trees with Hybrid \(\pi \)-p Distributions

Consider now decision trees. We know how to compare simple lotteries; in order to compare strategies, i.e. compound lotteries, we define a principle of reduction of hybrid \(\pi \)-p compound lotteries:

Definition 9

(\(Reduction^{\rho }\)). For any compound lottery of the form \(L={<} \rho ^{\alpha }_1/L^{\rho ^{\alpha }}_1, \cdots , \rho ^{\alpha }_m/ L^{\rho ^{\alpha }}_m{>}\), \(Red^{\rho }(L)\) is the simple lottery that associates to each \(\lambda _i\) the weight \(\overline{\rho }^{\alpha }_i= S_{j=1}^m T( \rho ^{\alpha }_j, \rho ^{\alpha ,j}_i)\), where \(\rho ^{\alpha ,j}_i\) denotes the confidence value of getting \(\lambda _i\) through lottery \(L^{\rho ^\alpha }_j \) and \(\rho ^{\alpha }_j\) the confidence value of getting \(L^{\rho ^\alpha }_j\).

It is easy to get the following result:

Proposition 5

$$\begin{aligned} \overline{\rho }^{\alpha }_i= {\left\{ \begin{array}{ll} \mathop {\max }\nolimits _{j=1,...,m} \min (\rho ^{\alpha }_j, \rho ^{\alpha ,j}_i) \text { if } \not \exists j \text { s.t. } \rho ^{\alpha }_j> \alpha \text {, } \rho ^{\alpha ,j}_i>\alpha \\ \alpha + \frac{\sum _{j|\rho ^{\alpha }_j,\rho ^{\alpha ,j}_i>\alpha } (\rho ^{\alpha }_j-\alpha ) (\rho ^{\alpha ,j}_i-\alpha )}{1-\alpha } \text { otherwise} \end{array}\right. } \end{aligned}$$
(12)
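A compact sketch (ours) of this reduction, again with compound lotteries given as lists of (weight, simple lottery) pairs and reusing the S_alpha / T_alpha functions defined earlier:

from functools import reduce
from collections import defaultdict

def reduce_rho(compound, alpha):
    # bar rho^alpha_i = S_j T(rho^alpha_j, rho^{alpha,j}_i)
    per_lambda = defaultdict(list)
    for rho_j, lot_j in compound:
        for lam, rho_ji in lot_j.items():
            per_lambda[lam].append(T_alpha(rho_j, rho_ji, alpha))
    return {lam: reduce(lambda a, b: S_alpha(a, b, alpha), terms)
            for lam, terms in per_lambda.items()}

Evaluating a strategy then amounts to computing, e.g., ES_opt(reduce_rho(compound, alpha), alpha).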

The following proposition states the main result of this paper:

Proposition 6

Given a decision tree with \(\rho ^{\alpha }\) measure, the hybrid criteria \(ES^{Opt}\) and \(ES^{Pes}\) satisfy the weak monotonicity property.

Proof

We give the proof for \(ES^{Opt}\); the proof is similar for \(ES^{Pes}\). For each lottery L, two cases are possible: (a) \(\not \exists i\) such that \(\rho _i>\alpha \) and \(\lambda _i>\alpha \), and (b) \(\exists i\) such that \(\rho _i>\alpha \) and \(\lambda _i>\alpha \). Let us explore all possible configurations.

  1. L, \(L'\) and \(L''\) are in (a): \(\forall i\), the t-norm \(\min \) and the t-conorm \(\max \) are used, so we are in the possibilistic case and the weak monotonicity property holds.

  2. L and \(L'\) in (a) and \(L''\) in (b); we distinguish two cases:

    i) if \(v\le \alpha \), then we are again in the case with t-norm \(\min \) and t-conorm \(\max \), so the weak monotonicity property holds.

    ii) if \(v>\alpha \), then \({<}u/L,v/L''{>}\) and \({<}u/L',v/L''{>}\) are in (b) and \(ES^{Opt}({<}u/L,v/L''{>})=ES^{Opt}({<}u/L',v/L''{>})\), so weak monotonicity is satisfied.

  3. L, \(L'\) and \(L''\) in (b); we distinguish three cases:

    i) if \(v\le \alpha \) and \(u>\alpha \), then \({<}u/L,v/L''{>}\) and \({<}u/L',v/L''{>}\) are in (b), so \(ES^{Opt}({<}u/L,v/L''{>})=\alpha + \frac{\sum _{i|\lambda _i,T^{\alpha }(\rho ^{\alpha ,L}_i,u)>\alpha } (\lambda _i-\alpha ) \times (T^{\alpha }(\rho ^{\alpha ,L}_i,u)-\alpha )}{1-\alpha }=\alpha +(u-\alpha )\frac{\sum _{i|\lambda _i,\rho ^{\alpha ,L}_i>\alpha } (\lambda _i-\alpha ) \times (\rho ^{\alpha ,L}_i-\alpha )}{(1-\alpha )^2}\le ES^{Opt}({<}u/L',v/L''{>})=\alpha +(u-\alpha )\frac{\sum _{i|\lambda _i,\rho ^{\alpha ,L'}_i>\alpha } (\lambda _i-\alpha ) \times (\rho ^{\alpha ,L'}_i-\alpha )}{(1-\alpha )^2}\), so weak monotonicity is satisfied.

    ii) if \(v>\alpha \) and \(u\le \alpha \), we are in a situation similar to that of case 2 ii).

    iii) if \(v>\alpha \) and \(u>\alpha \), then \(ES^{Opt}({<}u/L,v/L''{>})=\alpha +(u-\alpha )\frac{ES^{Opt}(L)-\alpha }{1-\alpha }+(v-\alpha )\frac{ES^{Opt}(L'')-\alpha }{1-\alpha }\le ES^{Opt}({<}u/L',v/L''{>})=\alpha +(u-\alpha )\frac{ES^{Opt}(L')-\alpha }{1-\alpha }+(v-\alpha )\frac{ES^{Opt}(L'')-\alpha }{1-\alpha }\), since \(ES^{Opt}(L)\le ES^{Opt}(L')\).

  4. \(L''\) in (a) and L, \(L'\) in (b); we distinguish two cases: if \(u>\alpha \), the proof is similar to that of case 3 i), and if \(u\le \alpha \), it is similar to that of case 2 i).

  5. L, \(L''\) in (a) and \(L'\) in (b); we distinguish two cases:

    i) if \(u>\alpha \), then \(ES^{Opt}({<}u/L,v/L''{>})\le \alpha \) and \(ES^{Opt}({<}u/L',v/L''{>})> \alpha \), hence the property holds.

    ii) if \(u\le \alpha \), then \(ES^{Opt}({<}u/L,v/L''{>})\le ES^{Opt}({<}u/L',v/L''{>})\); the proof is similar to that of case 2 i) since \(u/L'\) is in (a).

  6. L in (a) and \(L'\), \(L''\) in (b); we distinguish three cases:

    i) if \(u\le \alpha \) and \(v>\alpha \), then the proof is similar to that of case 2 ii).

    ii) if \(u> \alpha \) and \(v\le \alpha \), then \({<}u/L,v/L''{>}\) is in (a) and \({<}u/L',v/L''{>}\) is in (b); by Proposition 4, the weak monotonicity property holds.

    iii) if \(u> \alpha \) and \(v> \alpha \), then \(ES^{Opt}({<}u/L,v/L''{>})=\alpha +(v-\alpha )\frac{ES^{Opt}(L'')-\alpha }{1-\alpha }\le ES^{Opt}({<}u/L',v/L''{>})=\alpha +(u-\alpha )\frac{ES^{Opt}(L')-\alpha }{1-\alpha }+(v-\alpha )\frac{ES^{Opt}(L'')-\alpha }{1-\alpha }\), so the weak monotonicity property holds. \(\Box \)

From Proposition 6, when the decision maker provides a decision tree equipped with a \(\rho ^{\alpha }\) measure, \(ES^{Opt}\) and \(ES^{Pes}\) satisfy the three basic properties required in the introduction: consequentialism, dynamic consistency and lottery reduction.

4.3 Composing Possibilistic and Probabilistic Lotteries

According to Proposition 2, a (compound) lottery can be viewed as two (compound) lotteries, a possibilistic one \(L^\pi = {<}\pi _1 / L^{\pi }_1, \dots , \pi _m / L^{\pi }_m{>}\) and a probabilistic one \( L^p = {<}p_1 / L^{p}_1, \dots , p_m / L^{p}_m{>}\), both on the same decision tree. Assume that \(\pi _i < 1\) implies \(p_i= 0\), \(i = 1, \dots , m\). We are interested in merging them into a hybrid lottery, given the parameter \(\alpha \). Proposition 2 leads to define a fusion operation:

$$\begin{aligned} F(\pi ,p,\alpha )= {\left\{ \begin{array}{ll} \rho ^{\alpha }_i= \alpha \pi _i \text { if } \pi _i< 1,\\ \rho ^{\alpha }_i=\alpha +(1-\alpha ) p_i \text { otherwise.} \end{array}\right. } \end{aligned}$$
(13)
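A sketch (ours) of this fusion at the level of distributions; applied to the \(\pi \) and p extracted in the check of Proposition 2, it gives back the distribution of Example 1:

def fuse(pi, p, alpha):
    # Eq. (13): rho_i = alpha * pi_i if pi_i < 1, and alpha + (1 - alpha) * p_i otherwise.
    return {s: alpha * pi[s] if pi[s] < 1.0 else alpha + (1 - alpha) * p[s]
            for s in pi}

pi = {'a': 5/7, 'b': 5/7, 'c': 1.0, 'd': 1.0}
p  = {'a': 0.0, 'b': 0.0, 'c': 1/3, 'd': 2/3}
print(fuse(pi, p, 0.7))   # approximately {'a': 0.5, 'b': 0.5, 'c': 0.8, 'd': 0.9}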

Merging the possibility and probability distributions locally should be equivalent to merging the reduced lotteries at the global level. Moreover, the fusion operation must be distributive over the reduction operator.

Property 1

(Distributivity over reduction). Let \(L= (L^{\pi }, L^{p})\) be a pair of possibilistic and probabilistic compound lotteries on the same decision tree. Operator F is said to satisfy the distributivity property iff \(F(Red^{\pi }(L^{\pi }),Red^{p}(L^{p}),\alpha )= Red^{\rho }({<}F(\pi _1,p_1,\alpha )/ F(L^{\pi }_1,L_1^{p},\alpha ), \dots , F(\pi _m,p_m,\alpha ) / F(L^{\pi }_m,L_m^{p},\alpha ){>})\).

Proposition 7

\(F(\pi ,p,\alpha )\), defined by Eq. (13), satisfies Property 1.

Proof

When applying reduction to the probabilistic tree followed by F, we obtain \(\rho _i=\alpha +(1-\alpha )\sum _j p_j\times p^i_j\) for all i s.t. \(\exists j\) with \(p^i_j> 0, p_j>0\), and \(\rho _i=\alpha \max _j \min (\pi _j, \pi ^i_j)\) otherwise. When applying F first: if \(\exists j\) with \(p^i_j>0, p_j>0\), then \(\rho ^{\alpha ,i}_j>\alpha \) and \(\rho ^{\alpha }_j>\alpha \). From Proposition 5, the \(\rho ^{\alpha }\)-reduction is \(\overline{\rho }^{\alpha }_i=\alpha +(1-\alpha )\sum _j p_j\times p^i_j\). Otherwise, the \(\rho ^{\alpha }\)-reduction is \(\max _{j=1,...,m}\min (\rho ^{\alpha ,i}_{j},\rho ^{\alpha }_{j})\) with \(\min (\rho ^{\alpha ,i}_{j},\rho ^{\alpha }_{j})= \alpha \pi ^i_j\) or \(\alpha \pi _j\), so we obtain \(\rho _i=\alpha \max _j \min (\pi _j, \pi ^i_j)\). \(\Box \)

This result ensures the dynamic consistency of the hybrid \(\pi \)-p approach to sequential decision-making under uncertainty.

5 Conclusion

In this paper, we have tried to extend the range of decision trees that can be solved by dynamic programming while respecting consequentialism as well as dynamic consistency, beyond standard probabilistic decision trees and possibilistic ones. We have shown that everything relies on (i) defining the uncertainty measure by means of a generalized weight distribution on utility values; (ii) the possibility of reducing compound lotteries into simple ones; (iii) the definition of a utility functional by means of a generalized integral. This paper proposes a solution to this problem, and shows that it leads to a very restricted family of decomposable measures with respect to a specific family of t-conorms, due to the conditional distributivity property required to ensure lottery reduction. The paper proves that the obtained utility functionals satisfy the weak monotonicity property, which ensures the computability of optimal decisions via dynamic programming. The kind of uncertainty function laid bare in this study turns out to be a Shafer plausibility (resp. belief) function obtained as a convex mixture of probability and possibility (resp. necessity) functions, which opens the way to a natural interpretation of these uncertainty measures.