
2.1 Introduction

Multiple criteria decision aid (MCDA) is concerned with supporting the structuring and modeling of decision problems involving multiple conflicting criteria. Similarly to other operations research/management science approaches, MCDA methods are based on modeling assumptions related to the characteristics of the problem, the aggregation of the decision criteria, and the preferential system of the decision maker (DM). Naturally, these assumptions incorporate uncertainties, fuzziness, and errors, which affect the quality of the obtained recommendations. As a result, changes in the decision context and the available data may lead to completely different outputs.

In this framework, robustness analysis has emerged as a major research issue in MCDA, emphasizing the need to re-think the traditional multicriteria framework so as to provide satisfactory recommendations even in cases where the decision context is altered. Roy [21] described the robustness concern in detail, arguing that it is raised by vague approximations and zones of ignorance that cause the formal representation of a problem to diverge from the real-life context, due to: (1) the way imperfect knowledge is treated, (2) the inappropriate preferential interpretation of certain types of data (e.g., transformations of qualitative attributes), (3) the use of modeling parameters to grasp complex aspects of reality, and (4) the introduction of technical parameters with no concrete meaning.

MCDA provides a wide arsenal of methodologies and techniques that enable the systematic treatment of decision problems under multiple criteria. In this chapter we focus on the preference disaggregation approach (PDA), which is involved with the inference of preferential information and decision models from data [15]. PDA techniques can greatly facilitate the model construction process, reducing the cognitive effort required by DMs when specifying complex preferential information and modeling parameters.

Robustness analysis in the framework of PDA is based on analytic and simulation techniques (for an overview see [7]). This chapter considers the former approach, which is based on two main schemes. The first focuses on the construction of a single decision model that best represents the available decision instances [5, 13], whereas the second is involved with the formulation of a range of recommendations on the basis of all models compatible with the given data [10, 12]. In this chapter we re-analyze the robustness of such approaches and introduce new robustness metrics following a data-driven perspective. More specifically, we are concerned with robustness issues in terms of variations in the data instances used to infer a decision model. A similar view of robustness is very common in other fields that also involve model inference from data (e.g., statistical learning [6]), but its analytic treatment in the context of MCDA has been limited so far, despite the existence of experimental results supporting its significance [8, 24]. This chapter contributes in that direction and proposes tools based on well-known concepts from optimization theory. The analysis is focused on decision models expressed in the form of additive value functions for classification (sorting) problems, which involve the assignment of a finite set of alternative options into predefined performance categories [27]. For the purposes of the presentation an illustrative example is used.

The rest of the chapter is organized in four sections. Section 2.2 introduces the framework of preference disaggregation analysis for classification problems and presents the main existing robustness analysis techniques and approaches from the MCDA literature. Section 2.3 discusses the importance of the proposed data-driven framework for robustness analysis in disaggregation techniques and introduces new robustness indicators constructed on the basis of this framework. Section 2.4 presents results from the application on an example data set and finally Sect. 2.5 concludes the chapter and discusses some future research directions.

2.2 Preference Disaggregation for Multicriteria Classification

2.2.1 General Framework

Multicriteria problems include multi-objective optimization and discrete evaluation cases. In this chapter we are concerned with the latter type, which is about the evaluation of a set X of discrete alternatives over n performance criteria. The result of the evaluation may be expressed in different forms, such as a choice, ranking, or classification. The present study focuses on classification problems, where the alternatives under consideration should be classified into q rank-ordered performance categories \(C_{1} \succ C_{2} \succ \cdots \succ C_{q}\). Category \(C_{1}\) is assumed to consist of the best alternatives, whereas \(C_{q}\) consists of the worst performing ones.

In this context, a decision model \(F(\mathbf{x},\beta ) \rightarrow \{ C_{1},\ldots,C_{q}\}\) aggregates the available information about the criteria and provides recommendations about the classification of the alternatives. The model is explicitly defined by the parameters \(\beta\), which may relate to the relative importance of the criteria or other information about the aggregation process.

In the field of MCDA there is a wide range of different types of decision and evaluation models. Some common examples include value functions [17], outranking models [20, 25], and decision rules [9]. Bouyssou et al. [2] provide a comprehensive overview of different MCDA models and their characterization.

For the remainder of the presentation this chapter will focus on additive value function (AVF) models, which have been widely used in MCDA. The general form of an AVF is:

$$\displaystyle{ V (\mathbf{x}_{i}) =\sum _{ k=1}^{n}w_{ k}v_{k}(x_{ik}) }$$
(2.1)

where \(\mathbf{x}_{i} = (x_{i1},\ldots,x_{in})\) is the data vector for alternative i (\(x_{ik}\) being the data of i on criterion k), \(w_{1},\ldots,w_{n} \geq 0\) are trade-off constants (normalized to sum up to one) representing the relative importance of the criteria, and \(v_{1}(\cdot),\ldots,v_{n}(\cdot)\) are the marginal value functions of the criteria. The marginal value functions decompose the overall performance \(V(\mathbf{x}_{i})\) of each alternative i into partial assessments at the criteria level, each usually scaled between 0 and 1.

The most straightforward way to use a value function model for classifying an alternative into predefined rank-ordered classes is to employ the following decision rule:

$$\displaystyle{ t_{\ell} < V (\mathbf{x}_{i}) < t_{\ell-1} \Leftrightarrow \mathbf{x}_{i} \in C_{\ell} }$$
(2.2)

where \(t_{0} = 1 > t_{1} > t_{2} > \cdots > t_{q-1} > t_{q} = 0\) are thresholds that distinguish the classes. Alternative classification rules can also be employed such as the example-based approach of Greco et al. [12] or the hierarchical model of Zopounidis and Doumpos [26].
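
To make the aggregation (2.1) and the threshold rule (2.2) concrete, the following minimal Python sketch evaluates an AVF with piecewise-linear marginal value functions and classifies an alternative. It is an illustration only; all breakpoints, weights, and thresholds are hypothetical values, not data from the chapter.

```python
import numpy as np

# Minimal sketch of an AVF with piecewise-linear marginal value functions and
# the threshold-based classification rule (2.2). All numbers are illustrative.
breakpoints = [np.array([0.0, 50.0, 100.0]),   # criterion 1 (maximization form)
               np.array([0.0, 40.0, 100.0])]   # criterion 2
marginals   = [np.array([0.0, 0.7, 1.0]),      # v_1 at the breakpoints
               np.array([0.0, 0.3, 1.0])]      # v_2 at the breakpoints
weights     = np.array([0.6, 0.4])             # trade-off constants, sum to 1
thresholds  = [1.0, 0.55, 0.25, 0.0]           # t_0 > t_1 > t_2 > t_3 (q = 3 classes)

def global_value(x):
    """Overall value V(x) = sum_k w_k * v_k(x_k), as in (2.1)."""
    v = np.array([np.interp(x[k], breakpoints[k], marginals[k])
                  for k in range(len(weights))])
    return float(weights @ v)

def classify(x):
    """Return the index l of the class C_l with t_l < V(x) <= t_{l-1}."""
    V = global_value(x)
    for l in range(1, len(thresholds)):
        if V > thresholds[l]:
            return l                 # 1 corresponds to the best class C_1
    return len(thresholds) - 1       # worst class C_q

print(classify(np.array([80.0, 60.0])))   # -> 1 for these illustrative values
```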

In the framework of PDA, the parameters of the model are inferred from a sample of m decision instances \(X^{{\prime}} =\{ \mathbf{x}_{i},y_{i}\}_{i=1}^{m}\), where y i denotes the given class label for alternative i. This sample (referred to as the reference set) may consist of decisions about alternatives considered in past situations or decisions about a set of alternatives which can be easily judged by the DM [15].

Formally, the model that is most compatible with the information in the reference set is defined by parameters \(\widehat{\beta }^{{\ast}}\) such that:

$$\displaystyle{ \widehat{\beta }^{{\ast}} =\arg \min _{\widehat{\beta } \in \mathcal{A}}L[Y _{X^{{\prime}}},F(X^{{\prime}},\widehat{\beta })] }$$
(2.3)

where \(F(X^{{\prime}},\widehat{\beta })\) denotes the outputs of a model with parameters \(\widehat{\beta }\) for the alternatives in \(X^{{\prime}}\), \(\mathcal{A}\) is the set of acceptable parameter values, and L(⋅ ) is a function that measures the differences between the recommendations of the model and the actual assessments \(Y _{X^{{\prime}}}\) for the reference alternatives. If the solution of the above problem (2.3) is judged satisfactory, then the inferred parameters \(\widehat{\beta }^{{\ast}}\) can be used to extrapolate the model to any other alternative outside the reference set.

For a value function model, problem (2.3) is expressed in a mathematical programming form. In particular, the inference of a classification model (weights of the criteria, marginal value functions, and classification thresholds) from the reference examples can be expressed as the following optimization problem:

$$\displaystyle{ \mbox{ min}\qquad \sum _{\ell=1}^{q} \frac{1} {m_{\ell}}\sum _{\mathbf{x}_{i}\in C_{\ell}}(\sigma _{i}^{+} +\sigma _{ i}^{-}) }$$
(2.4)
$$\displaystyle{ \mbox{ s.t.}\qquad V (\mathbf{x}_{i}) +\sigma _{ i}^{+} \geq t_{\ell} +\delta \qquad \forall \,\mathbf{x}_{ i} \in C_{\ell},\,\ell= 1,\ldots,q - 1 }$$
(2.5)
$$\displaystyle{ V (\mathbf{x}_{i}) -\sigma _{i}^{-}\leq t_{\ell-1} -\delta \qquad \forall \,\mathbf{x}_{i} \in C_{\ell},\,\ell= 2,\ldots,q }$$
(2.6)
$$\displaystyle{ t_{\ell} - t_{\ell+1} \geq \varepsilon \qquad \ell = 1,\ldots,q - 2 }$$
(2.7)
$$\displaystyle{ V (\mathbf{x}_{{\ast}}) = 0,\,\,V (\mathbf{x}^{{\ast}}) = 1 }$$
(2.8)
$$\displaystyle{ V (\mathbf{x}) \geq V (\mathbf{x}^{{\prime}})\qquad \forall \,\mathbf{x} \geq \mathbf{x}^{{\prime}} }$$
(2.9)
$$\displaystyle{ \sigma _{i}^{+},\,\sigma _{ i}^{-}\geq 0\qquad i = 1,\ldots,m }$$
(2.10)

The objective function minimizes the total weighted classification error, where the weights are defined on the basis of the number of reference alternatives from each class (\(m_{1},\ldots,m_{q}\)). The error variables \(\sigma^{+}\) and \(\sigma^{-}\) are defined through constraints (2.5)–(2.6) as the magnitude of the violations of the classification rule (2.2) (δ is a small positive constant used to ensure the strict inequalities), whereas constraint (2.7) ensures that the class thresholds are defined in a decreasing sequence (\(\varepsilon\) is a small positive constant). Constraint (2.8) defines the scale of the additive model between 0 and 1 (0 corresponds to the performance of the least preferred alternative \(\mathbf{x}_{*}\) and 1 corresponds to the performance of an ideal action \(\mathbf{x}^{*}\)). Finally, constraint (2.9) ensures that the model is non-decreasing with respect to the performance criteria (assuming all criteria are in maximization form).

For the case of an AVF, the above optimization problem can be written in linear programming form with a piecewise-linear modeling of the marginal value functions (for the modeling details, see [4, 14]).
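
As an illustration of formulation (2.4)–(2.10), the sketch below assembles and solves the linear program with SciPy. For brevity it assumes linear marginal value functions on criteria already rescaled to [0, 1], so that \(V(\mathbf{x}) = \mathbf{w}\cdot\mathbf{x}\), normalization (2.8) reduces to sum(w) = 1, and monotonicity (2.9) reduces to w ≥ 0; the piecewise-linear modeling discussed in [4, 14] is omitted and the reference data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# Sketch of the inference LP (2.4)-(2.10) with linear marginal value functions
# on [0, 1]-rescaled criteria (illustrative assumption). Hypothetical data:
X = np.array([[0.9, 0.8], [0.7, 0.9], [0.5, 0.4], [0.3, 0.5], [0.2, 0.1]])
y = np.array([1, 1, 2, 2, 3])          # class labels, 1 = best class C_1
m, n = X.shape
q = int(y.max())
delta, eps = 1e-3, 1e-2

# Variable layout: [w_1..w_n, t_1..t_{q-1}, sigma+_1..sigma+_m, sigma-_1..sigma-_m]
nv = n + (q - 1) + 2 * m
def t_idx(l):  return n + (l - 1)          # threshold t_l, l = 1..q-1
def sp_idx(i): return n + (q - 1) + i      # sigma+_i
def sm_idx(i): return n + (q - 1) + m + i  # sigma-_i

c = np.zeros(nv)                           # objective (2.4): class-weighted errors
for i in range(m):
    c[sp_idx(i)] = c[sm_idx(i)] = 1.0 / np.sum(y == y[i])

A_ub, b_ub = [], []
for i in range(m):
    l = y[i]
    if l <= q - 1:                         # (2.5): w@x_i + sigma+_i >= t_l + delta
        r = np.zeros(nv); r[:n] = -X[i]; r[sp_idx(i)] = -1.0; r[t_idx(l)] = 1.0
        A_ub.append(r); b_ub.append(-delta)
    if l >= 2:                             # (2.6): w@x_i - sigma-_i <= t_{l-1} - delta
        r = np.zeros(nv); r[:n] = X[i]; r[sm_idx(i)] = -1.0; r[t_idx(l - 1)] = -1.0
        A_ub.append(r); b_ub.append(-delta)
for l in range(1, q - 1):                  # (2.7): t_l - t_{l+1} >= eps
    r = np.zeros(nv); r[t_idx(l)] = -1.0; r[t_idx(l + 1)] = 1.0
    A_ub.append(r); b_ub.append(-eps)

A_eq = np.zeros((1, nv)); A_eq[0, :n] = 1.0              # (2.8): sum(w) = 1
bounds = [(0, 1)] * (n + q - 1) + [(0, None)] * (2 * m)  # w, t in [0,1]; sigmas >= 0

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              A_eq=A_eq, b_eq=[1.0], bounds=bounds, method="highs")
print("weights:", res.x[:n], "thresholds:", res.x[n:n + q - 1])
```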

2.2.2 Robust Approaches

The robustness concern in the context of PDA arises because often alternative decision models can be inferred in accordance with the information embodied in the set of reference decision examples that a DM provides (i.e., the optimization model (2.4)–(2.10) often has multiple optimal solutions). This is particularly true for reference sets that do not contain inconsistencies, but it is also relevant when inconsistencies exist (in the PDA context, inconsistencies are usually resolved algorithmically or interactively with the DM before the final model is built; see for instance [19]).

With a consistent reference set the error variables can be removed from formulation (2.4)–(2.10), which then reduces to a set of feasible linear constraints defining all acceptable models that are compatible with the assignment of the reference alternatives.

$$\displaystyle{ \begin{array}{rl} &V (\mathbf{x}_{i}) \geq t_{\ell} +\delta \qquad \quad \forall \,\mathbf{x}_{i} \in C_{\ell},\,\ell= 1,\ldots,q - 1 \\ &V (\mathbf{x}_{i}) \leq t_{\ell-1} -\delta \qquad \quad \forall \,\mathbf{x}_{i} \in C_{\ell},\,\ell= 2,\ldots,q \\ &t_{\ell} - t_{\ell+1} \geq \varepsilon \qquad \quad \,\,\,\,\,\ell = 1,\ldots,q - 2 \\ &V (\mathbf{x}_{{\ast}}) = 0,\,\,V (\mathbf{x}^{{\ast}}) = 1 \\ &V (\mathbf{x}) \geq V (\mathbf{x}^{{\prime}})\qquad \quad \,\,\forall \,\mathbf{x} \geq \mathbf{x}^{{\prime}} \end{array} }$$
(2.11)

The size of the polyhedron defined through (2.11) is associated with the robustness of the results and can be affected by a number of factors. The most important of these factors relate to the adequacy of the set of reference examples and the complexity of the selected decision modeling form. The former is immediately related to the quality of the information on which model inference is based. Vetschera et al. [24] performed an experimental analysis to investigate how the size of the reference set affects the robustness and accuracy of the resulting multicriteria models in classification problems. They found that small reference sets (e.g., with a limited number of alternatives with respect to the number of criteria) lead to decision models that are neither robust nor accurate. Apart from its size, other characteristics of the reference set are also relevant, such as the existence of noisy data, outliers, and correlated criteria [4].

Traditional disaggregation techniques such as the family of the UTA methods use linear programming post-optimality techniques [22] in order to build a representative AVF defined as the average solution of some characteristic extreme points of the feasible polyhedron (2.11). Other approaches for selecting the most representative decision model include the regularization approach of Doumpos and Zopounidis [5], the analytic center formulation of Bous et al. [1], and the max-min model of Greco et al. [13]. As explained by Doumpos et al. [8] such approaches seek to identify (analytically) central solutions to the polyhedron defined by (2.11), which are expected to be more robust to changes in the data and the setting of the analysis.

Recently, alternative approaches have been proposed that enable the formulation of recommendations based on multiple decision models. Two main schemes can be identified in this framework. The first relies on simulation techniques, which sample, at random, different solutions (value functions) from the polyhedron defined by (2.11). The simulation process provides an approximate description of all models compatible with the classifications for the reference set and enables the formulation of a range of recommendations associated with probabilistic measures of confidence (see, for instance, [23]).

The second scheme, on which this study is focused, is based on approaches that seek to characterize the full set of acceptable models through analytic techniques, rather than using simulation. In particular, Greco et al. [12] introduced a modeling framework that takes into account all decision models (AVFs) compatible with the constraints (2.11). Their approach is based on the definition of necessary and possible assignments. The set of necessary assignments \(\mathcal{N}_{j}\) for a non-reference alternative \(j\not\in X^{{\prime}}\) consists of the classes in which j is classified by all models compatible with the reference set, whereas the set of possible assignments \(\mathcal{P}_{j}\) includes the results supported by at least one decision model. Obviously, \(\mathcal{N}_{j} \subseteq \mathcal{P}_{j}\). Furthermore, it should be noted that these definitions cover the general case where the reference alternatives might be classified in multiple classes (rather than the specific case described above where each alternative is assigned to only one class, in which case \(\mathcal{N}_{j}\) is either empty or a singleton).

Figure 2.1 provides a graphical illustration of the necessary and possible assignments for a two-class problem, assuming a linear decision model (linear value function). With the given reference set consisting of alternatives classified in two categories (circles and rectangles), it is evident that all models that separate the two classes assign the non-reference alternative \(\mathbf{x}_{1}\) to class \(C_{1}\). On the other hand, the precise classification of the non-reference action \(\mathbf{x}_{2}\) is not possible. In fact, this alternative can be assigned to either of the two categories.

Fig. 2.1 An illustration of possible and necessary assignments

The necessary and possible assignments for a non-reference alternative j can be obtained through linear programming [12, 16]. In particular, a class \(C_{\ell}\) belongs to the set of possible assignments for a non-reference alternative j if the optimal objective value of the following linear program is strictly positive:

$$\displaystyle{ \begin{array}{rl} &\max \quad \gamma \\ &\mbox{ s.t.}\quad \,\,\,\,\,t_{\ell}+\gamma \leq V (\mathbf{x}_{j}) \leq t_{\ell-1}-\gamma \\ &\quad \qquad \mbox{ constraints }(\mbox{ 2.11})\mbox{ for }X^{{\prime}}\end{array} }$$
(2.12)

Similarly, a class \(C_{\ell}\) belongs to the set of necessary assignments for alternative j if either of the following two linear programs has a non-positive optimal objective function value:

$$\displaystyle{ \begin{array}{rl} &\max \quad \gamma \qquad \qquad \qquad \qquad \qquad \qquad \qquad \max \quad \gamma \\ &\mbox{ s.t.}\quad \,\,\,\,\,V (\mathbf{x}_{j}) \geq t_{\ell-1} +\gamma \qquad \quad \qquad \qquad \mbox{ s.t.}\quad \,\,\,\,\,V (\mathbf{x}_{j}) \leq t_{\ell}-\gamma \\ &\qquad \quad \,\,\mbox{ constraints }(\mbox{ 2.11})\mbox{ for }X^{{\prime}}\qquad \qquad \qquad \quad \mbox{ constraints }(\mbox{ 2.11})\mbox{ for }X^{{\prime}} \end{array} }$$
(2.13)

If γ ≤ 0 in the optimal solution of the left problem, then j cannot be assigned to any of the classes in the set \(\{C_{1},\ldots,C_{\ell-1}\}\), which implies that \(C_{\ell} \in \mathcal{N}_{j}\). On the other hand, if the optimal solution of the right problem yields γ ≤ 0, then j cannot be assigned to any of the classes in the set \(\{C_{\ell+1},\ldots,C_{q}\}\), which again implies that \(C_{\ell} \in \mathcal{N}_{j}\).

It follows that, for every non-reference alternative j, the obtained possible assignments define an interval \([L_{j}, U_{j}]\) spanning the best and worst possible ratings that can be supported by the information available in the evaluations of the reference actions.
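
A sketch of the possible-assignment test (2.12) is given below, under the same simplifying assumptions as the earlier sketch (linear marginal value functions on [0, 1]-rescaled criteria) and with hypothetical, consistent reference data. The function checks whether at least one value function compatible with (2.11) assigns a new alternative to a given class.

```python
import numpy as np
from scipy.optimize import linprog

# Sketch of the possible-assignment test (2.12), assuming linear marginal value
# functions on [0, 1]-rescaled criteria. Data and new alternative are hypothetical.
X = np.array([[0.9, 0.8], [0.7, 0.9], [0.5, 0.4], [0.3, 0.5], [0.2, 0.1]])
y = np.array([1, 1, 2, 2, 3])
m, n = X.shape
q = int(y.max())
delta, eps = 1e-3, 1e-2

# Variables: [w_1..w_n, t_1..t_{q-1}, gamma]
nv = n + (q - 1) + 1
def t_idx(l): return n + (l - 1)
g = nv - 1

def polyhedron_rows():
    """Inequality rows (A x <= b) encoding the consistent polyhedron (2.11)."""
    A, b = [], []
    for i in range(m):
        l = y[i]
        if l <= q - 1:                    # V(x_i) >= t_l + delta
            r = np.zeros(nv); r[:n] = -X[i]; r[t_idx(l)] = 1.0
            A.append(r); b.append(-delta)
        if l >= 2:                        # V(x_i) <= t_{l-1} - delta
            r = np.zeros(nv); r[:n] = X[i]; r[t_idx(l - 1)] = -1.0
            A.append(r); b.append(-delta)
    for l in range(1, q - 1):             # t_l - t_{l+1} >= eps
        r = np.zeros(nv); r[t_idx(l)] = -1.0; r[t_idx(l + 1)] = 1.0
        A.append(r); b.append(-eps)
    return A, b

def possibly_in_class(x_j, l):
    """True if some model compatible with (2.11) assigns x_j to C_l (test (2.12))."""
    A, b = polyhedron_rows()
    if l <= q - 1:                        # t_l + gamma <= V(x_j)
        r = np.zeros(nv); r[:n] = -x_j; r[t_idx(l)] = 1.0; r[g] = 1.0
        A.append(r); b.append(0.0)
    if l >= 2:                            # V(x_j) <= t_{l-1} - gamma
        r = np.zeros(nv); r[:n] = x_j; r[t_idx(l - 1)] = -1.0; r[g] = 1.0
        A.append(r); b.append(0.0)
    c = np.zeros(nv); c[g] = -1.0         # maximize gamma
    A_eq = np.zeros((1, nv)); A_eq[0, :n] = 1.0
    bounds = [(0, 1)] * (n + q - 1) + [(None, None)]
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b),
                  A_eq=A_eq, b_eq=[1.0], bounds=bounds, method="highs")
    return res.status == 0 and -res.fun > 0

x_j = np.array([0.6, 0.6])
print([possibly_in_class(x_j, l) for l in range(1, q + 1)])  # -> [True, True, False]
```

The necessary-assignment tests (2.13) can be implemented analogously, replacing the γ constraints with the corresponding one-sided conditions.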

The identification of the necessary and possible assignments provides valuable additional information as opposed to simple point recommendations obtained from a single decision model, thus enhancing the robustness of the results. However, given that the necessary and possible assignments are data-driven results (i.e., they are obtained from a specific reference set), it is apparent that they are also subject to the robustness concern. Figure 2.2 provides an illustration of this issue. According to the given two-class reference set (circles and rectangles), the indicated non-reference alternative is necessarily assigned to class C 2 by all linear value functions compatible with the available reference evaluations. This result, however, is not robust because a reconsideration of the evaluations for the two circled reference alternatives will lead to a different outcome.

Fig. 2.2 An example of a necessary assignment that is not robust

Kadziński and Tervonen [16] proposed combining robust analytic procedures, based on the specification of the necessary and possible assignments, with simulation techniques. The latter provide further information in probabilistic form about the necessary and possible assignments. Simulation-based methods, however, only provide an approximate description of the set of compatible models and can be computationally intensive for larger data sets involving many alternatives and criteria.

In the next section we present new ways and metrics to gain further insight into the robustness of necessary and possible assignments, without requiring the use of simulation. The proposed approaches adopt a data-driven perspective, in the sense that they are based on the properties of the available reference set. Their implementation is grounded in well-known techniques from optimization theory.

2.3 Data-Driven Robustness Indicators for Multicriteria Classification Problems

Motivated by the above discussion about the robustness concern for classification recommendations formulated using a set of decision models, this section presents simple techniques that can be used to gain a better understanding of the robustness issue in relation to the problem data, as represented in a set of reference assessments. The main idea is based on analyzing the changes in the feasible polyhedron (2.11) caused by incorporating the necessary/possible assignments into a given reference set.

To this end, first a simple support measure can be defined. Assume that according to a given reference set \(X^{{\prime}}\), a non-reference alternative j can be assigned to any of the classes in the interval \([L_{j}, U_{j}]\). Then, the support measure \(S_{j}\) is defined as the minimum number of changes that need to be made in the assignments of the reference actions in order to allow the classification of j into classes outside \([L_{j}, U_{j}]\). The lower this support measure is, the less robust is the obtained interval assignment \([L_{j}, U_{j}]\), because minor changes in the reference set will lead to different conclusions.

The computation of support can be done in a straightforward manner through the solution of the following two mixed-integer linear programming problems:

$$\displaystyle{ \begin{array}{rl} &\min \quad \sum _{i=1}^{m}(\sigma _{i}^{+} +\sigma _{i}^{-})\qquad \qquad \qquad \min \quad \sum _{i=1}^{m}(\sigma _{i}^{+} +\sigma _{i}^{-}) \\ &\mbox{ s.t.}\quad V (\mathbf{x}_{j}) \geq t_{L_{j}-1} +\delta \qquad \qquad \qquad \mbox{ s.t.}\quad V (\mathbf{x}_{j}) \leq t_{U_{j}}-\delta \\ &\qquad \,\,\mbox{ constraints (2.5)\textendash (2.9) for }X^{{\prime}}\quad \qquad \,\,\,\,\mbox{ constraints (2.5)\textendash (2.9) for }X^{{\prime}} \\ &\qquad \,\,\sigma _{i}^{+},\,\sigma _{i}^{-}\in \{ 0,\,1\}\quad \qquad \qquad \qquad \qquad \,\,\,\,\sigma _{i}^{+},\,\sigma _{i}^{-}\in \{ 0,\,1\}\end{array} }$$
(2.14)

The left problem applies to cases where \(L_{j} \geq 2\) and returns the minimum number of changes that need to be made in the assignments of the reference actions in order to classify the non-reference alternative j to the set of categories \(\{C_{1},\ldots,C_{L_{j}-1}\}\). Similarly, the right problem applies to cases with \(U_{j} \leq q - 1\) and returns the minimum number of changes that need to be made in the assignments of the reference actions in order to classify the non-reference alternative j to the set of categories \(\{C_{U_{j}+1},\ldots,C_{q}\}\).

The support measure \(S_{j}\) can then be defined as the minimum of the optimal objective values of the two problems. When \(L_{j} = 1\) and \(U_{j} = q\), \(S_{j}\) is by definition equal to zero. In other cases, if \(S_{j}\) is non-zero but low, then the DM may accept the changes identified through the solution of the above optimization models, thus forming a new reference set \(X_{j}^{{\prime}}\).
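
The support computation (2.14) only adds binary error variables and one extra constraint to the earlier inference sketch. The following Python sketch solves the left problem with SciPy's mixed-integer solver, again under the simplifying assumption of linear marginal value functions on [0, 1]-rescaled criteria; the data, the alternative x_j, and its assumed best possible class L_j are hypothetical.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Sketch of the left problem in (2.14): the minimum number of reference
# assignments that must change so that x_j can be classified above its best
# possible class L_j. Linear marginals on [0, 1]-rescaled criteria assumed.
X = np.array([[0.9, 0.8], [0.7, 0.9], [0.5, 0.4], [0.3, 0.5], [0.2, 0.1]])
y = np.array([1, 1, 2, 2, 3])
m, n = X.shape
q = int(y.max())
delta, eps = 1e-3, 1e-2
x_j, L_j = np.array([0.25, 0.3]), 2       # assumed best possible class of x_j is C_2

# Variables: [w (n), t_1..t_{q-1}, sigma+ (m, binary), sigma- (m, binary)]
nv = n + (q - 1) + 2 * m
def t_idx(l): return n + (l - 1)
def sp(i):    return n + (q - 1) + i
def sm(i):    return n + (q - 1) + m + i

rows, lo, hi = [], [], []
def add(row, lb, ub):
    rows.append(row); lo.append(lb); hi.append(ub)

for i in range(m):
    l = y[i]
    if l <= q - 1:                        # (2.5): w@x_i + sigma+_i - t_l >= delta
        r = np.zeros(nv); r[:n] = X[i]; r[sp(i)] = 1.0; r[t_idx(l)] = -1.0
        add(r, delta, np.inf)
    if l >= 2:                            # (2.6): w@x_i - sigma-_i - t_{l-1} <= -delta
        r = np.zeros(nv); r[:n] = X[i]; r[sm(i)] = -1.0; r[t_idx(l - 1)] = -1.0
        add(r, -np.inf, -delta)
for l in range(1, q - 1):                 # (2.7): t_l - t_{l+1} >= eps
    r = np.zeros(nv); r[t_idx(l)] = 1.0; r[t_idx(l + 1)] = -1.0
    add(r, eps, np.inf)
r = np.zeros(nv); r[:n] = 1.0; add(r, 1.0, 1.0)            # (2.8): sum(w) = 1
r = np.zeros(nv); r[:n] = x_j; r[t_idx(L_j - 1)] = -1.0    # V(x_j) >= t_{L_j-1} + delta
add(r, delta, np.inf)

c = np.zeros(nv); c[n + q - 1:] = 1.0          # count changed reference assignments
integrality = np.zeros(nv); integrality[n + q - 1:] = 1
res = milp(c, constraints=LinearConstraint(np.array(rows), lo, hi),
           integrality=integrality, bounds=Bounds(0, 1))
print("changes needed towards better classes:", round(res.fun))
```

The right problem in (2.14) is analogous (using \(V(\mathbf{x}_{j}) \leq t_{U_{j}} - \delta\)), and \(S_{j}\) is the minimum of the two optimal values.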

In order to compare the size of the feasible polyhedron corresponding to the new reference set to that of the initially available reference set \(X^{{\prime}}\), we consider two measures based on well-known results from optimization theory.

The first measure is based on the radius of the largest ball inscribed inside the feasible polyhedron. Given a polyhedron {x | A x ≤ b}, the radius r of the largest ball inscribed in it can be computed from the following linear program [3]:

$$\displaystyle{ \begin{array}{lll} \max &r \\ \mbox{ s.t.}&\mathbf{a}_{i}^{\top }\mathbf{x} + r\|\mathbf{a}_{i}\|_{2} \leq b_{i},&\quad \forall \,i \end{array} }$$
(2.15)

where a i is the ith row of A.
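
For the Chebyshev-ball measure, problem (2.15) is a plain linear program. The sketch below computes the radius for a generic polyhedron {x | Ax ≤ b}; the small two-dimensional box used for testing is only an illustrative stand-in for the polyhedron (2.11).

```python
import numpy as np
from scipy.optimize import linprog

def chebyshev_radius(A, b):
    """Radius of the largest ball inscribed in {x | A x <= b}, via LP (2.15)."""
    mC, d = A.shape
    # Variables: [x (d), r]; maximize r subject to a_i.x + r * ||a_i|| <= b_i.
    c = np.zeros(d + 1); c[-1] = -1.0
    A_ub = np.hstack([A, np.linalg.norm(A, axis=1, keepdims=True)])
    bounds = [(None, None)] * d + [(0, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b, bounds=bounds, method="highs")
    return res.x[-1]

# Illustrative polyhedron: the unit box 0 <= x <= 1 in two dimensions.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])
r0 = chebyshev_radius(A, b)
print(r0)   # -> 0.5

# The indicator (2.16) then compares the radii obtained for the original and
# the modified reference set, e.g. R_j = np.log(r0) / np.log(r_j).
```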

This approach can be straightforwardly applied to find the radius r 0 of the largest ball inscribed inside the polyhedron (2.11) corresponding to the original reference set and compare it to the radius r j of the largest ball for the modified reference set \(X_{j}^{{\prime}}\). Then, the following robustness measure can be defined:

$$\displaystyle{ R_{j} = \frac{\log r_{0}} {\log r_{j}} }$$
(2.16)

The case R j  > 1 indicates that the modified reference set \(X_{j}^{{\prime}}\), which allows the classification of the non-reference alternative j outside its first computed range of assignments [L j , U j ], provides more options for choosing an acceptable decision model. Thus, the modification of \(X^{{\prime}}\) towards the new reference set \(X_{j}^{{\prime}}\) is likely to lead to more robust results. On the other hand, the case R j  < 1 indicates that the modified reference set is more restrictive compared to \(X^{{\prime}}\), which implies that this modification is more sensitive to changes of the reference set (i.e., less robust).

As an alternative to the above metric, the size of the polyhedron corresponding to the set of compatible decision models can be assessed through the volume of the largest ellipsoid inscribed inside the polyhedron. Compared to the above metric, this is a more suitable approach for irregular polyhedra, which cannot be well described by the largest ball inscribed inside them (e.g., because they have elongated extremes).

The volume of the largest ellipsoid inside a polyhedron {x | A x ≤ b} can be found from the solution of the following convex optimization problem [3]:

$$\displaystyle{ \begin{array}{lll} \min &v =\log \det \mathbf{B}^{-1} \\ \mbox{ s.t.}&\|\mathbf{B}\mathbf{a}_{i}\|_{2} + \mathbf{a}_{i}^{\top }\mathbf{d} \leq b_{i},&\quad \forall \,i\end{array} }$$
(2.17)

where d is a vector of decision variables defining the center of the ellipsoid whose volume is proportional to \(\det \mathbf{B}\). Similarly to the previous measure, this optimization problem can be used to compare the volume of the largest ellipsoid inscribed inside the polyhedron (2.11) corresponding to the original reference set, against the volume for the modified reference set \(X_{j}^{{\prime}}\). The robustness measure in this case is defined as follows:

$$\displaystyle{ V _{j} = \frac{v_{0}} {v_{j}} }$$
(2.18)

Similarly to the interpretation of (2.16), the case \(V_{j} > 1\) indicates that the modification of the original reference set to allow the classification of the non-reference alternative j outside its first computed range of assignments \([L_{j}, U_{j}]\) leads to more available options for selecting an acceptable decision model (i.e., higher robustness), whereas the case \(V_{j} < 1\) corresponds to a smaller (less robust) polyhedron.
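
A sketch of the maximum-volume inscribed ellipsoid computation (2.17) is given below, using CVXPY as one possible convex-optimization modeling tool (an assumption, not prescribed by the chapter); the two-dimensional box again stands in for the polyhedron (2.11).

```python
import numpy as np
import cvxpy as cp

def inscribed_ellipsoid_v(A, b):
    """Optimal value v = log det(B^-1) of problem (2.17) for {x | A x <= b},
    where {B u + d : ||u|| <= 1} is the largest inscribed ellipsoid."""
    mC, d = A.shape
    B = cp.Variable((d, d), PSD=True)      # shape matrix of the ellipsoid
    ctr = cp.Variable(d)                   # center d of the ellipsoid
    cons = [cp.norm(B @ A[i], 2) + A[i] @ ctr <= b[i] for i in range(mC)]
    prob = cp.Problem(cp.Minimize(-cp.log_det(B)), cons)
    prob.solve()
    return prob.value

# Illustrative polyhedron: the unit box 0 <= x <= 1 in two dimensions.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])
v0 = inscribed_ellipsoid_v(A, b)
print(v0)   # -> about 1.386 (= log 4; the optimal ellipsoid is the ball of radius 0.5)

# The indicator (2.18) then compares the optimal values for the original and
# the modified reference set, V_j = v0 / v_j.
```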

2.4 Illustrative Results

In order to examine the potential of the data-driven robustness measures introduced in the previous section, we present results from their application to a data set taken from Mousseau et al. [18]. The data involve 100 alternatives evaluated on seven criteria (all in minimization form). The alternatives are classified in three performance categories: high performance (class H), medium performance (class M), and low performance (class L).

For the purposes of the analysis, a reference set of 30 randomly selected alternatives (10 alternatives from each category) is used. Table 2.1 presents the results for the necessary (\(\mathcal{N}\)) and possible (\(\mathcal{P}\)) assignments of the 70 non-reference alternatives obtained with the chosen reference set, as opposed to their actual classification (columns). Out of the five alternatives actually belonging to the high performance class, four are assigned to the same category by all models compatible with the selected reference set (necessary assignments), whereas one alternative is classified with some ambiguity in classes H or M (possible assignments). Similarly, 17 out of the 28 alternatives from class M are classified in the same category by all models derived from the selected reference set. However, 11 alternatives from class M are classified with ambiguity: three can be classified in H or M, five can be classified in M or L, whereas three actions can be assigned to any of the three categories (H, M, L). Finally, 20 necessary assignments are specified for alternatives of class L, whereas the remaining 17 alternatives of this class are assigned to categories M or L (possible assignments).

Table 2.1 Necessary and possible assignments for the non-reference alternatives

To examine the robustness of the above results a resampling exercise is conducted. In particular, first a subsample of 20 alternatives is selected, at random, from the initially chosen reference set of 30 actions. Using this subsample as a new reference set, the necessary and possible assignments are computed for all of the 70 non-reference alternatives. A single AVF model is also constructed through formulation (2.4)–(2.10) and it is used to specify a single assignment for each one of the non-reference actions. The same experiment is repeated 100 times, each time based on a different random subsample (new reference set) of 20 alternatives.

In each one of the above 100 tests, the best and worst assignments are identified for all non-reference alternatives. Table 2.2 presents the average frequencies with which each non-reference action is classified in the three categories. The results are reported in comparison to the necessary and possible assignments identified through the original reference set of 30 actions. Discrepancies between the results from the full reference set and the ones obtained from the 100 random tests are shown in bold.

Table 2.2 Classification frequencies (in %) with the full set of AVFs corresponding to different perturbations of the reference set

For the alternatives necessarily assigned to category H, the simulation tests are mostly consistent with the necessary assignments. There is only a small likelihood (2.5 %) that an action necessarily assigned to class H under the full set might be downgraded to category M if the reference set changes. However, the discrepancies for the two other categories are higher. For instance, for the alternatives that are necessarily assigned to category M with the full reference set, there is a significant likelihood (22.6 %) that they will be upgraded to category H if the reference set changes. There is also a notable likelihood (19.7 %) of downgrading these alternatives to the low performance class L. Thus, claiming that these alternatives are consistently assigned to class M under all models compatible with the reference set does not seem to be a very robust conclusion, because variations of the reference set often lead to different outcomes.

The same also holds true for alternatives that are necessarily assigned to the low performance class L under the full reference set. In this case, there is a notable likelihood (25.6 %) that they could be upgraded to the medium performance category M with a perturbed reference set, whereas the likelihood of an even further upgrade to class H is 4.6 %.

Similar discrepancies are also observed for the possible assignments, which are expressed in interval form. For instance, focusing on the alternatives that can be classified in H or M under the full reference set, the simulation test indicates that they could actually be classified to category L with some perturbation of the reference set.

Table 2.3 presents similar results with a single AVF model, obtained through the solution of problem (2.4)–(2.10) for each reference set in the 100 test runs. In this case smaller discrepancies are observed (shown in bold) between the results obtained with a single decision model (columns) and the necessary/possible assignments derived from the full reference set (rows). This should be of no surprise, as a single model does not provide information about extreme assignments like those considered in the above results.

Table 2.3 Classification frequencies (in %) with a single AVF for random perturbations of the reference set

The above results support the argument of this study that, similarly to point recommendations derived with a single decision model (AVF), interval results formulated on the basis of a set of decision models are also subject to the robustness concern when the reference data change.

Table 2.4 reports some results about the support measure and the uncertainty of the assignments for the non-reference alternatives. Uncertainty is defined as the entropy of the assignments over the 100 test runs, with higher entropy values indicating higher ambiguity in the obtained classifications. Results are presented for the extreme (best and worst) assignments as well as for the assignments obtained with a single AVF. For the extreme assignments only the cases with positive support are considered because, as explained earlier, a zero support indicates that the possible assignments cover all classes (e.g., from H to L in this example). For the results of the single AVF we also consider the cases with zero support, to examine how ambiguous alternatives are classified when a single decision model is used. The obtained results clearly indicate that higher support is associated with lower ambiguity (i.e., lower entropy values) for all classifications, both the interval ones and the single AVF model assignments.
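
For reference, the entropy-based uncertainty measure used in Table 2.4 can be computed directly from the empirical class-assignment frequencies. The short sketch below uses hypothetical frequency vectors, not the chapter's data.

```python
import numpy as np

def assignment_entropy(freq):
    """Shannon entropy of a vector of class-assignment frequencies (counts or %)."""
    p = np.asarray(freq, dtype=float)
    p = p[p > 0] / p.sum()
    return float(-(p * np.log(p)).sum())

print(assignment_entropy([80, 15, 5]))   # ambiguous assignments -> higher entropy
print(assignment_entropy([100, 0, 0]))   # stable assignment -> entropy 0
```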

Table 2.4 Entropy of assignments vs support

Regarding the two robustness indicators (2.16) and (2.18), which consider the size of the feasible polyhedron, they were found to be highly correlated to each other (Pearson correlation higher than 0.85) and strongly negatively correlated to the support measure (correlation about −0.6). The latter result implies that the robustness of the assignments for non-reference alternatives with low support can be improved by reconsidering the evaluations of the corresponding reference actions.

Table 2.5 provides details about the average values of the robustness indicators R and V, as defined by (2.16) and (2.18), for all assignments of the non-reference alternatives (the results are averages over the 100 tests). It is evident that both indicators attain their maximum values when the alternatives are classified in their respective necessary assignments. For instance, for alternatives assigned to category H by all models compatible with the full reference set, both R and V are equal to one for class H, whereas their values are lower for classes M (R = 0.81, V = 0.76) and L (R = 0.84, V = 0.75). Thus, both indicators confirm that H is the most robust assignment for these alternatives. The same holds for alternatives necessarily assigned to classes M and L using the full reference set. For alternatives for which the full reference set indicates that they can be classified in H or M (possible assignments), the two indicators again verify that these are the most robust conclusions (classes H and M correspond to higher values of R and V compared to class L). Similar conclusions are also drawn for alternatives possibly assigned to M or L. These results indicate that the two proposed robustness indicators are in accordance with the definitions of necessary and possible assignments, and enhance them with additional information that provides an analytic estimate of the robustness of the results, without resorting to approximate simulation-based approaches.

Table 2.5 The robustness indicators for all assignment results (non-reference alternatives)

As a final test of the information content and validity of the two proposed indicators, we consider the classification of the alternatives whose classification is ambiguous according to the reference set used in the analysis. These are the 29 non-reference alternatives for which only their possible assignments could be defined (i.e., the alternatives classified in {H, M}, {M, L}, or {H, M, L}). To specify a single classification result for these cases we compare three different approaches:

  1. For each of the 100 perturbations of the reference set, construct a single AVF model, use it to classify the alternatives, and finally use a majority rule to aggregate the 100 results for each alternative and specify the most appropriate class assignment.

  2. Classify the alternatives to the class for which the R measure is highest.

  3. Classify the alternatives to the class for which the V measure is highest.

The results of these three procedures are compared against the actual classification of the alternatives. The accuracy rate (i.e., the percentage of correct classifications) for the assignments obtained through the majority rule was found to be 89.7 %, the assignments with the R measure had an accuracy rate of 82.8 %, whereas using the V measure led to an accuracy of 96.6 %. These results indicate that the two robustness indicators can constitute the basis for formulating good recommendations about the most appropriate classification when a reference set leads to ambiguous conclusions. Between the two indicators, the one based on the volume of the ellipsoid inscribed inside the feasible polyhedron (V ) appears to provide better results.

2.5 Conclusions and Future Research

The robustness of MCDA models has recently been an active research topic, attracting a lot of interest from different perspectives. In this chapter we focused on the PDA framework for constructing decision models from data related to classification problems. PDA is based on a data-driven scheme. As such, changes in the data used to construct a decision model can have a significant impact on the results.

Motivated by this fact, this study presented simple, yet effective ways to assess the robustness of MCDA models in the form of AVFs for classification problems. The proposed measures provide analytic estimates of the ambiguity resulting from the information that a given data set provides, based on tools and techniques from optimization theory. The analytic form of the measures introduced in this study makes them applicable to all cases, even when dealing with large problem instances (i.e., reference sets with many actions and criteria).

The illustrative results presented in this chapter indicate that the proposed measures enhance existing robust MCDA techniques with additional information. Their connection with the concept of robustness in the data-driven context explained above was verified and their usefulness for formulating better decision recommendations was demonstrated.

However, the positive properties of the measures introduced in this study and the preliminary results should be further explored. To this end, applications to large real data sets and further experimental testing will provide additional insights. Comparisons with simulation-based approaches could also be useful for constructing a unified framework for analyzing robustness and assessing the statistical properties of the proposed measures. Finally, extensions to other types of decision problems, including ordinal regression [11], should be examined, together with an analysis of cases where inconsistencies, uncertainties, and fuzziness are present in the data.