1 Introduction

Multi-criteria decision analysis/multiple criteria decision aiding (MCDA) acknowledges most preferences are based on multiple evaluation criteria, and thereby making choices involves compromises between these criteria. Reference works (e.g., Greco et al. 2016) showcase the richness of the MCDA field concerning preference models, interaction methods, and applications. A significant and still evolving stream within MCDA concerns preference disaggregation methods (Jacquet-Lagrèze and Siskos 2001). These methods circumvent the need to elicit from a Decision Maker (DM) every parameter defining a preference model, by being able to infer it from holistic judgments provided by the DM. Pioneering preference disaggregation methods, namely the UTA family of methods (Jacquet-Lagrèze and Siskos 1981; Siskos et al. 2016; Matsatsinis et al. 2018), focused on the additive value function model (Keeney and Raiffa 1993). Subsequently, disaggregation methods have been developed for quite distinct models, such as ELECTRE (e.g., Mousseau and Slowinski 1998; Mousseau and Dias 2004; Dias and Mousseau 2018) and fuzzy capacities (e.g., Marichal and Roubens 2000; Angilella et al. 2010). Disaggregation methods, also called ordinal regression methods, typically base the parameter inference on a subset of real or fictitious reference alternatives for which the DM has provided a ranking from the best to the worst (Jacquet-Lagrèze and Siskos 1981; Figueira et al. 2009), a rating of the alternatives (Grigoroudis and Siskos 2002) or a classification among sorted categories (Dias et al. 2002; Zopounidis and Doumpos 2002), according to his/her holistic appreciation of the alternatives, or according to decisions made in the past.

Under the scope of additive value function models, UTA methods use mathematical programming to infer as well as possible the judgment provided by the DM. The objective is to minimize an error function, enabling these methods to provide a solution even if the judgments are not entirely consistent with an additive value function. Usually, there are multiple models equally good according to the objective function and several UTA variants are available to recommend one of the alternative optimal solutions seeking a secondary objective (Siskos et al. 2016; Matsatsinis et al. 2018). Other approaches explore the set of all the functions compatible with the DM’s judgments to perform robustness analyses (e.g., Figueira et al. 2009), from which a representative solution can be derived (Kadziński et al. 2012). More recently, the regularization framework has been proposed as a means to take into consideration not only the minimization of the error, but also the smoothness of the inferred value functions, possibly trading off these two objectives (Doumpos and Zopounidis 2007; Ghaderi et al. 2017; Kadziński et al. 2017; Liu et al. 2019). The present study aims at contributing to this methodological stream as well as to analyze, through an empirical study, its application to learning the preferences of a population towards vehicle technologies.

From a methodological point of view, this work has two noteworthy aspects: the proposal of simple mathematical formulations to obtain typical value-function shapes and the type of judgments used as an input. Concerning value function shapes, the context of this study allows to stipulate they are either increasing or decreasing, thus not requiring approaches that accept nonmonotonic functions (Ghaderi et al. 2017). In one formulation, there are no constraints to the shape of the value function. In a second formulation, the value functions are required to be one of the four typical shapes observed in practice (Parnell et al. 2013). Recent research has proposed sophisticated regularization approaches to control variation in the slopes of the value functions (Ghaderi et al. 2017; Ghaderi and Kadziński 2020). This avoids namely sudden increases from 0 to 1, or having neglected value functions, i.e., always null. In the present work, a different way of avoiding such extreme value functions is experimented with, consisting in simply maximizing the minimum slope. Details of these variants are provided in Sect. 2.

Concerning the judgements used as an input, this work considers choice-based questions of the type often used in marketing research (Green and Srinivasan 1990), rather than taking a ranking or a classification of alternatives as usual. Choice-based questions match well a purchase context (Jaeger et al. 2001), and the choice of one among a small number of alternatives (three at a time in this study) is cognitively less demanding than providing a ranking of a larger number of alternatives.

Asking multiple choice-base questions, on the other hand, does not preclude intransitive judgments, which cannot occur if the required input is a ranking of the reference alternatives. The literature on behavioral decision making shows that DMs are often inconsistent and subject to biases (Morton and Fasolo 2009; Montibeller and von Winterfeldt 2018). This makes the present study also relevant from a behavioral point of view, adding to the existing literature on the adequacy of additive value functions (Vetschera 2006; Schilling et al. 2007; Korhonen et al. 2012; Vetschera et al. 2014b; Lienert et al. 2016; Ishizaka and Siraj 2018).

From an empirical point of view, this work uses choice-based questions and MCDA analysis of more than a hundred individuals and reports to which extent they were consistent, what do their value functions inferred from choice-based questions look like, and how well do these functions represent their preferences for alternative vehicle technologies. Therefore, this work also partly contributes to learn more about consumer preferences towards the attributes of passenger vehicles, in particular concerning the electrification of powertrains (considering electric and hybrid vehicles besides traditional gasoline and diesel ones). It aims at observing to which extent the additive value function fits the preferences of a sample of Portuguese consumers, and what are the most common value function shapes in this regard. This is a relevant application topic, given the promise of these vehicles to decrease the environmental burden of car travel and also due to the slow adoption of such vehicles in most markets, having originated many studies to understand consumer preferences (Oliveira and Dias 2019). This study therefore adds to the literature on using disaggregation approaches to learn about the preferences of consumers (for a review see Siskos et al. 2016). A noteworthy recent example, using a sample of 94 individuals asked to rank 10 real phone contracts, is provided by Ghaderi and Kadziński (2020), who propose using knowledge of the preferences of a population to assist in learning individual preferences. Knowing a typical value function shape is also relevant for stochastic multi-attribute acceptability analyses (SMAA) simulating specific value function types (Dias and Vetschera 2019a).

Following this introduction, Sect. 2 presents the notation and mathematical models developed for this work. Next, Sect. 3 presents the empirical study and Sect. 4 presents the results. Section 5 presents conclusions and topics for future research suggested by this work.

2 Mathematical programming formulations

2.1 Notation and basic formulation

This research assumes, as a hypothesis, that preferences can be modelled by an additive value function and the appropriate mutual preference independence condition holds (Keeney and Raiffa 1993). Let G = {g1(.), …, gn(.)} denote a set of n criteria and let A denote a set of alternatives. Each alternative \(a\in A\) is fully characterized by a vector \(\left({g}_{1}\left(a\right),\ldots ,{g}_{n}\left(a\right)\right)\) indicating its performance according to the multiple criteria. According to the additive value function model, the value of an alternative a is given by

$$ v\left( a \right) = v\left( {g_{1} \left( a \right), \ldots ,g_{n} \left( a \right)} \right) = \mathop \sum \limits_{j = 1}^{n} v_{j} \left( {g_{j} \left( a \right)} \right), $$
(1)

where \({v}_{j}\left({g}_{j}\left(a\right)\right)\) denotes the value of a according to the j-th criterion and \(v\left(a\right)\) denotes the overall value of a.

The performance vectors are known but the value functions are unknown. Each value function \({v}_{j}\) is defined for a domain \(\left[{g}_{j*},{g}_{j}^{*}\right]\). Without loss of generality, the following equations assume \({g}_{j*}\) is the least preferred performance and \({g}_{j}^{*}\) is the most preferred performance (non-decreasing value function), but these can be easily adapted to the opposite situation (alternatively, the scale of the attribute can be reversed multiplying it by -1). As usual in UTA methods (Siskos et al. 2016), the value functions are normalized such that \({v}_{j}\left({g}_{j*}\right)=0\) and \(\sum_{j=1}^{n}{v}_{j}\left({g}_{j}^{*}\right)=1\).

UTA formulations build piece-wise value functions. The breakpoints for these piecewise value functions \({v}_{j}(.)\) are denoted \({B}_{j}=\left\{{b}_{j}^{0},\ldots ,{b}_{j}^{{\alpha }_{j}}\right\}\), and are fixed in advance, with \({g}_{j*}={b}_{j}^{0}\le \cdots \le {b}_{j}^{{\alpha }_{j}}={g}_{j}^{*}\). Each value function can be associated with as many breakpoints as desired. Usually the breakpoints are equidistant (Siskos et al. 2016) or they coincide with the performances of the alternatives evaluated by the DMs (Figueira et al. 2009). The following formulations intend to be general by making no assumptions about the breakpoints. The formulations follow closely other references in determining the value functions based on the slope of each segment (Doumpos and Zopounidis 2007; Liu et al. 2019).

Let \({v}_{j}^{k}={v}_{j}\left({b}_{j}^{k}\right)\) denote the value corresponding to breakpoint \({b}_{j}^{k}\). Let \({s}_{j}^{k}=\left({v}_{j}^{k}-{v}_{j}^{k-1}\right)/\left({b}_{j}^{k}-{b}_{j}^{k-1}\right)\) denote the slope of the line segment from point \(\left({b}_{j}^{k-1}, {v}_{j}^{k-1}\right)\) to point \(\left({b}_{j}^{k}, {v}_{j}^{k}\right)\). Then, if \({g}_{j}\left(a\right)\in \left[{b}_{j}^{k-1},{b}_{j}^{k}\right]\), for any k > 0, linear interpolation yields:

$${\mathrm{v}}_{\mathrm{j}}\left({\mathrm{g}}_{\mathrm{j}}\left(\mathrm{a}\right)\right)={\mathrm{v}}_{\mathrm{j}}^{\mathrm{k}-1}+\frac{{\mathrm{g}}_{\mathrm{j}}\left(\mathrm{a}\right)-{\mathrm{b}}_{\mathrm{j}}^{\mathrm{k}-1}}{{\mathrm{b}}_{\mathrm{j}}^{\mathrm{k}}-{\mathrm{b}}_{\mathrm{j}}^{\mathrm{k}-1}}\left({\mathrm{v}}_{\mathrm{j}}^{\mathrm{k}}-{\mathrm{v}}_{\mathrm{j}}^{\mathrm{k}-1}\right)={\mathrm{v}}_{\mathrm{j}}^{\mathrm{k}-1}+\left({\mathrm{g}}_{\mathrm{j}}\left(\mathrm{a}\right)-{\mathrm{b}}_{\mathrm{j}}^{\mathrm{k}-1}\right){\mathrm{s}}_{\mathrm{j}}^{\mathrm{k}}$$
(2)

Now, since \({v}_{j}^{k}={v}_{j}^{k-1}+\left({b}_{j}^{k}-{b}_{j}^{k-1}\right){s}_{j}^{k}\) and \({v}_{j}^{0}=0\), \({v}_{j}\left({g}_{j}\left(a\right)\right)\) can be written as

$$ v_{j} \left( {g_{j} \left( a \right)} \right) = \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \tau_{j}^{k} \left( a \right) \cdot s_{j}^{k} , $$
(3)

with \(\tau_{j}^{k} \left( a \right) = \left\{ {\begin{array}{*{20}l} {b_{j}^{k} - b_{j}^{k - 1} ,} \hfill &\quad {{\text{if}}\, g_{j} \left( a \right) \ge b_{j}^{k} } \hfill \\ {g_{j} \left( a \right) - b_{j}^{k - 1} ,} \hfill &\quad {{\text{if}} \,b_{j}^{k - 1} < g_{j} \left( a \right) < b_{j}^{k} } \hfill \\ {0,} \hfill &\quad {{\text{if}} \,g_{j} \left( a \right) \le b_{j}^{k - 1} } \hfill \\ \end{array} } \right.\).

Considering Bj and \({g}_{j}\left(a\right)\) (\({g}_{j}\)∈G, a ∈ A) as given inputs, the positive constants \({\tau }_{j}^{k}\) can be readily determined and then \({v}_{j}\left({g}_{j}\left(a\right)\right)\) becomes a linear function of \({s}_{j}^{k}\).

Let J denote a set of ordered pairs of alternatives in a reference set \({A}_{R}=\left\{{a}_{1},\ldots ,{a}_{m}\right\}\) for which the DM’s preference is known, such that \(\left({a}_{x},{a}_{y}\right)\in J\iff {a}_{x}\succ {a}_{y}\) (\({a}_{x}\) is preferred to \({a}_{y}\)). In the present work, J is obtained from choices involving triplets of options. If for instance the DM chooses ax when asked to choose from \(\left\{{a}_{x},{a}_{y},{a}_{z}\right\}\in {A}_{R}\), then \(\left({a}_{x},{a}_{y}\right)\) and \(\left({a}_{x},{a}_{z}\right)\) are added to J. The following linear program can then be used to infer a piece-wise additive value function that reproduces, exactly or approximately, the judgements in J:

MP1:

$$ \begin{aligned} & {\text{minimize}}\quad z = \mathop \sum \limits_{{\left( {a_{x} ,a_{y} } \right) \in J}} z_{xy} \\ & {\text{Subject to}}: \\ & \quad \left( {\mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \tau_{j}^{k} \left( {a_{x} } \right).s_{j}^{k} } \right) - \left( {\mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \tau_{j}^{k} \left( {a_{y} } \right).s_{j}^{k} } \right) + z_{xy} \ge \varepsilon , \quad \forall \left( {a_{x} ,a_{y} } \right) \in J \\ & \quad \mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \left( {b_{j}^{k} - b_{j}^{k - 1} } \right) \cdot s_{j}^{k} = 1 \\ & \quad s_{j}^{k} \ge 0, \quad \forall j,k \\ & \quad z_{xy} \ge 0, \quad \forall \left( {a_{x} ,a_{y} } \right) \in J \\ \end{aligned} $$

According to this linear programming (LP) formulation, the objective is to minimize the sum of the error terms (\({z}_{xy}\)) associated with each judgement in the reference set. The function slopes (\({s}_{j}^{k}\)) and the error terms are the decision variables. The first set of constraints (one per pairwise judgment) indicates that the value of the two alternatives involved in a judgment respect the stated preference (in this study a constant \(\varepsilon =0.001\) was used to represent a strict inequality). The second constraint normalizes the solution such that the value of a fictitious ideal alternative \(\left({b}_{1}^{{\alpha }_{1}},\ldots ,{b}_{n}^{{\alpha }_{n}}\right)\) would be equal to 1, placing an upper bound to the value functions. The definition of \({\tau }_{j}^{k}(a)\), together with Eq. (3), imply the lower bound \({v}_{j}\left({g}_{j*}\right)=0\). The remaining constraints require all the variables, i.e., the error terms and the slopes, to be non-negative. If the optimal solution to this linear program z* is null, then the optimal slopes define an additive value function that respects all the preference statements without any error.

The above formulation treats the error terms in a way similar to Ghaderi et al. (2017) (each error term is associated with a constraint) rather than the typical UTA formulation (Siskos et al. 2016) (each error term is associated with an alternative). This is due to the possibility that J contains intransitive or inconsistent judgments. In this work, as described below in Sect. 3, judgments in J result from choice experiments where the same alternatives are repeated in the choice questions according to a fractional factorial design. It cannot therefore be ruled out that, for instance, a DM chooses a1 when asked to choose from {a1, a2, a3}, and chooses a2 when asked to choose from {a2, a3, a5}, and chooses a5 when asked to choose from {a1, a4, a5}. Together this implies intransitive judgments: \({a}_{1}\succ {a}_{2}\), \({a}_{2}\succ {a}_{5}\) and \({a}_{5}\succ {a}_{1}\). When two alternatives appear in different questions, it may even happen the DM chooses a1 in one question where a2 is available, and later chooses a2 in one question where a1 is available. If one of these issues occurs, the standard UTA formulation, UTASTAR (Siskos et al. 2016), becomes infeasible, as it has been developed to handle a ranking (consistent and transitive) provided by the DM. Hence, in this work the error terms are associated with constraints.

2.2 Post-optimization formulation

Many different value functions can lead to the same optimal value z* in MP1. The literature on disaggregation approaches offers many possibilities to deal with alternative optima. For instance, UTASTAR (Siskos et al. 2016) solves several post-optimization LPs to find multiple extreme value functions, which are then averaged. This avoids value functions that might be considered too extreme, for instance placing all the weight on a single criterion by having \({v}_{k}\left({g}_{k}^{*}\right)=1\) for some criterion and \({v}_{j}\left({g}_{j}^{*}\right)=0, \forall j\ne k\).

In this work we try out a different and very simple way of seeking less extreme value functions, by trying to maximize the minimum slope among all the value functions. Thus, only one LP is solved after MP1 to maximize the minimum slope \({s}_{min}\) without accepting an error greater than z* obtained in MP1, here denoted \({z}_{MP1}^{opt}\):

MP1 po (MP1 post-optimization):

$$ \begin{aligned} & {\text{maximize}}\quad s_{min} \\ & {\text{Subject to}}: \\ & \quad \left( {\mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \tau_{j}^{k} \left( {a_{x} } \right) \cdot s_{j}^{k} } \right) - \left( {\mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \tau_{j}^{k} \left( {a_{y} } \right) \cdot s_{j}^{k} } \right) + z_{MP1}^{opt} \ge \varepsilon ,\quad \forall \left( {a_{x} ,a_{y} } \right) \in J \\ & \quad \mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \left( {b_{j}^{k} - b_{j}^{k - 1} } \right) \cdot s_{j}^{k} = 1 \\ & \quad s_{j}^{k} \ge s_{min} /\left( {b_{j}^{{\alpha_{j} }} - b_{j}^{0} } \right), \quad \forall j,k \\ & \quad s_{min} \ge 0 \\ \end{aligned} $$

The first constraint uses a constant slack \({z}_{MP1}^{opt}\), the optimal value of MP1, so that the total error does not increase when the minimum slope is maximized (of course, as it was optimal in MP1, the error will not decrease either). The constraint \({s}_{j}^{k}\ge {s}_{min}/({b}_{j}^{{\alpha }_{j}}-{b}_{j}^{0})\) takes into account the different amplitudes of the criteria scales. This constraint limits the slope relatively to the amplitude of each scale (\({b}_{j}^{{\alpha }_{j}}-{b}_{j}^{0}\)), considering this amplitude as representing one unit of change in the performance axis. In this way, the \({s}_{min}\) lower bound can be applied to all the criteria simultaneously. When maximizing \({s}_{min}\), its optimum value cannot exceed 1/n. Indeed, if \({s}_{j}^{k}>1/n({b}_{j}^{{\alpha }_{j}}-{b}_{j}^{0}), \forall j,k\), then \(\sum_{j=1}^{n}\sum_{k=1}^{{\alpha }_{j}}\left({b}_{j}^{k}-{b}_{j}^{k-1}\right).{s}_{j}^{k}>\sum_{j=1}^{n}\sum_{k=1}^{{\alpha }_{j}}\left({b}_{j}^{k}-{b}_{j}^{k-1}\right)\times 1/n({b}_{j}^{{\alpha }_{j}}-{b}_{j}^{0})=\sum_{j=1}^{n}({b}_{j}^{{\alpha }_{j}}-{b}_{j}^{0})\times 1/n({b}_{j}^{{\alpha }_{j}}-{b}_{j}^{0})=1\), but this would not be possible since it violates the constraint \(\sum_{j=1}^{n}\sum_{k=1}^{{\alpha }_{j}}\left({b}_{j}^{k}-{b}_{j}^{k-1}\right).{s}_{j}^{k}=1\). Thus, in the limit one has \({s}_{j}^{k}=1/n({b}_{j}^{{\alpha }_{j}}-{b}_{j}^{0}), \forall j,k\), corresponding to linear value functions and the same weights \({v}_{j}\left({g}_{j}^{*}\right)=1/n\) for all the criteria.

2.3 Shape-constrained formulations

Shape-constrained formulations will also be solved as an alternative to MP1 and MP1po. These new formulations constrain all the value functions to have one of the four typical shapes observed in practice: linear, concave, convex, or S-shaped (Parnell et al. 2013), see Fig. 1.

Fig. 1
figure 1

Value function shapes (solid = increasing, dashed = decreasing): a linear, b concave, c convex, d S-shaped, e atypical shape

As an alternative to MP1, the following 0–1 linear program will attempt to find a solution constraining the shape of each value function to be either concave, or convex, or S-shaped (including the linear value function as particular cases). Considering an increasing value function, an S-shaped function is initially convex as value increases up to some inflection point \({g}_{j}^{i}\), and then becomes concave. If \({g}_{j}^{i}={g}_{j}^{*}\), then the function is convex in its domain \(\left[{g}_{j*},{g}_{j}^{*}\right]\); if \({g}_{j}^{i}={g}_{j*}\), then the function is concave in its domain. Thus, constraining the function to be S-shaped for some unknown inflection point \({g}_{j}^{i}\in \left[{g}_{j*},{g}_{j}^{*}\right]\) (note this includes \({g}_{j*}\) and \({g}_{j}^{*}\)) can lead to any of the typical value functions, excluding functions such as e) in Fig. 1.

For a concave function the slope is non-increasing, \({s}_{j}^{k-1}\ge {s}_{j}^{k}\ge {s}_{j}^{k+1}\), for a convex function the slope is non-decreasing, \({s}_{j}^{k-1}\le {s}_{j}^{k}\le {s}_{j}^{k+1}\), and for an S-shaped function (convex to concave), the slope function must be quasi-concave, i.e., the slope cannot decrease and then increase again: \({s}_{j}^{k}\ge {s}_{j}^{k-1}\) or \({s}_{j}^{k}\ge {s}_{j}^{k+1}\). The S-shaped includes as particular cases the conditions for being convex or concave over the entire domain, and therefore the following mathematical program only imposes this constraint, being similar to MP1 in all remaining aspects:

MP2:

$$ \begin{aligned} & {\text{minimize}}\quad z = \mathop \sum \limits_{{\left( {a_{x} ,a_{y} } \right) \in J}} z_{xy} \\ & {\text{Subject to}}: \\ & \quad \left( {\mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \tau_{j}^{k} \left( {a_{x} } \right).s_{j}^{k} } \right) - \left( {\mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \tau_{j}^{k} \left( {a_{y} } \right) \cdot s_{j}^{k} } \right) + z_{xy} \ge \varepsilon ,\quad \forall \left( {a_{x} ,a_{y} } \right) \in J \\ & \quad \mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \left( {b_{j}^{k} - b_{j}^{k - 1} } \right) \cdot s_{j}^{k} = 1 \\ & \quad s_{j}^{1} ,s_{j}^{{\alpha_{j} }} \ge 0,\quad \forall j \\ & \quad \left. { \begin{array}{*{20}l} {s_{j}^{k} \ge s_{j}^{k - 1} - Mo_{j}^{k} } \hfill \\ {s_{j}^{k} \ge s_{j}^{k + 1} - M + Mo_{j}^{k} } \hfill \\ \end{array} } \right\}\forall j,\quad \forall k \in \left\{ {2, \ldots ,\alpha_{j} - 1} \right\} \\ & \quad z_{xy} \ge 0, \forall \left( {a_{x} ,a_{y} } \right) \in J \\ & \quad o_{j}^{k} \in \left\{ {0,1} \right\}, \quad \forall j,\quad \forall k \in \left\{ {2, \ldots ,\alpha_{j} - 1} \right\} \\ \end{aligned} $$

The main difference compared to MP1 is the introduction of the binary variables \({o}_{j}^{k}\), which implement the non-exclusive OR condition \({s}_{j}^{k}\ge {s}_{j}^{k-1}\vee {s}_{j}^{k}\ge {s}_{j}^{k+1}\) when constant M is a sufficiently large number.

After solving MP2 it is possible to obtain a more regular value function by maximizing the minimum slope \({s}_{min}\), similarly to MP1po:

MP2 po (MP2 post-optimization)

$$ \begin{aligned} & {\text{maximize}}\quad s_{min} \\ & {\text{Subject to}}: \\ & \quad \left( {\mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \tau_{j}^{k} \left( {a_{x} } \right) \cdot s_{j}^{k} } \right) - \left( {\mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \tau_{j}^{k} \left( {a_{y} } \right) \cdot s_{j}^{k} } \right) + z_{MP2}^{opt} \ge \varepsilon , \quad \forall \left( {a_{x} ,a_{y} } \right) \in J \\ & \quad \mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{{\alpha_{j} }} \left( {b_{j}^{k} - b_{j}^{k - 1} } \right) \cdot s_{j}^{k} = 1 \\ & \quad s_{j}^{1} \ge s_{min} /\left( {b_{j}^{{\alpha_{j} }} - b_{j}^{0} } \right) \wedge s_{j}^{{\alpha_{j} }} \ge s_{min} /\left( {b_{j}^{{\alpha_{j} }} - b_{j}^{0} } \right),\quad \forall j \\ & \quad \left. { \begin{array}{*{20}l} {s_{j}^{k} \ge s_{j}^{k - 1} - Mo_{j}^{k} } \hfill \\ {s_{j}^{k} \ge s_{j}^{k + 1} - M + Mo_{j}^{k} } \hfill \\ \end{array} } \right\}\forall j,\quad \forall k \in \left\{ {2, \ldots ,\alpha_{j} - 1} \right\} \\ & \quad s_{min} \ge 0 \\ & \quad o_{j}^{k} \in \left\{ {0,1} \right\}, \quad \forall j,\quad \forall k \in \left\{ {2, \ldots ,\alpha_{j} - 1} \right\} \\ \end{aligned} $$

The differences between MP2po and MP2 are similar to the differences between MP1po and MP1, i.e., the fixed slack \({z}_{MP2}^{opt}\) (a constant equal to the optimal value z* of MP2) ensures the global error does not increase and the minimum slope is maximized taking into account the amplitudes of the criteria scales.

Figure 2 illustrates the difference in value functions obtained for a given DM (with null error) using MP1 and MP2po (the criteria are described in the next section). The MP2po formulation not only avoids the atypical shape in the first criterion, it also avoids null slopes and as a consequence value is more equally distributed among the criteria.

Fig. 2
figure 2

Illustration of four value functions obtained by MP1 (above) and MP2po (below) for a given DM

3 Application to learning preferences for vehicle technologies

Transportation is responsible for an important share fossil fuels use, associated with 24.6% of the greenhouse gas emissions in the EU in 2017 (European Comission 2019). Yet, uptake of alternative fuel vehicles, such as electric and hybrid vehicles, has been slow, motivating many studies to understand consumer preferences (e.g., Oliveira and Dias 2019; Christidis and Focas 2019), as well as the context used in the present study.

The dataset for this research was built from face-to-face interviews with 128 voluntary participants, potential buyers of a vehicle. Basic demographic information is presented in Table 1. This sample is not representative of the entire Portuguese population, as younger consumers (55.5%) and males (63.3%) are clearly overrepresented.

Table 1 Population interviewed

Each participant performed two tasks. First, he or she responded to a stated preference questionnaire with 14 questions (in “Appendix”). Each question presented three alternatives from which the participant should indicate the most preferred one. The use of three alternatives is common for choice-based experiments (e.g., Caulfield et al. 2010; Hoen and Koetse 2014; Wolbertus et al. 2018) to approximate the experiment to a real purchase context and also because it increases the efficiency of the survey design (Kuhfeld et al. 1994; Pinnell and Englert 1997).

The 14 questions result from a fractional factorial design, obtained from XLSTAT software. The design combines the attribute levels in Table 2, in order to minimize the number of questions necessary to estimate efficiently preferences for different products (Kuhfeld 2003), as usually done in consumer preference studies (e.g., Caulfield et al. 2010; Hackbarth and Madlener 2016). Although the levels are realistic, their combinations do not necessarily correspond to a real alternative.

Table 2 Attribute levels used for the fractional factorial design

The set of criteria was chosen based on a previous study (Oliveira and Dias 2015) where Portuguese consumers were asked to name criteria important to them, but excluding criteria that are not specific to the vehicle powertrain (e.g., manufacturer, aesthetics): Cost to purchase, Range (distance that can be driven without fueling/charging the vehicle), Mileage (measured in terms of fuel cost per 100 km), and Emissions (CO2 emissions per km). The answers to the 14 questions define the set of judgments J for each respondent (28 preference statements, as the participants always needed to prefer one of the choices).

The answers are used to infer a value function as described in Sect. 2, making it possible to forecast how these respondents would evaluate a set of vehicles based on those existing in the market, depicted in Table 3: BEV (Battery Electric Vehicle); BEV+ (Battery Electric Vehicle with greater range); HEV (Hybrid Electric Vehicle); Gasoline (Gasoline vehicle); Diesel (Diesel vehicle); and PHEV (Plug-in Hybrid Electric Vehicle). The performances of these vehicles lay within the bounds used in Table 2. However, it was not possible to know the real preferences (revealed preferences) of the participants concerning those vehicles.

Table 3 Performance table (based on representative vehicles in the market)

As an approximation, a second task in the interview consisted in examining these vehicles in detail through a step-by-step construction of a value function for each criterion using the bisection method, followed by trade-off questions for weights elicitation, as traditionally done to elicit an additive value function (Belton and Stewart 2002; Goodwin and Wright 2014). The bisection method was chosen as it leads the participants to apprehend the concept of strength of preference (or preference intensity) better than simply asking for a direct rating. The trade-off questions were chosen instead of a swings-based elicitation as these questions lead the participants to acknowledge and fine-tune compensation among criteria. This requires the respondents to think hard about their preferences, which can be a benefit, but it did not demand too much effort as the number of bisection questions (three per criterion) and trade-off questions (three in total) was not very high.

The interviewers were M.Sc. and Ph.D. students specifically trained for this task, acting as decision analysts. The respondents were able to appreciate the global value obtained by each vehicle and the resulting ranking, and they were given the opportunity check if they agreed with it entirely. If they did not, they could change any part of it to match their preferences, based on what they experienced and learnt during the process. The option of allowing amending the final ranking was chosen in order to ensure the interview would not take too much time from the respondents. With more time available, the analyst could lead the respondent to review all the elicitation answers from the beginning, as occurred in a study by Keeney et al. (1990), which showed that results obtained through an MCDA elicitation process over two days frequently did not match the intuitive values given by the participants at the outset.

The second task was always performed after the first one, since due to potential learning, the answers might differ if the order of the tasks was reversed, as studied by Oliveira and Dias (2020) on a different study focused on this issue. For comparison purposes, the final ranking thus obtained is considered in this study as a good approximation to the respondent’s real preferences, since this task required them to ponder their choices in a structured way and also because they could change the ranking if they disagreed.

The data collected in these interviews is used to analyze different aspects relevant in this research:

  • The extent to which DMs are consistent in choice-based questions and their preferences can be represented by an additive value function;

  • What is the “premium”, in terms of increased error, of demanding a typical value function shape;

  • How much different are value functions obtained with formulations MP1, MP1po, MP2 and MP2po;

  • How well do the rankings and choices that would be obtained using the inferred value functions match the “real” final rankings and top choices of the respondents obtained in the second task (concerning the alternatives in Table 3).

4 Results

4.1 Error (internal and external to choice questions)

Many inconsistencies were observed. The design of the 14 questions did not exclude dominated alternatives (a total of four, see questions 4, 5, 12, 14 in “Appendix”), which is a well-known downside of stated preference surveys fulfilling orthogonality (Kuhfeld 2003). This nevertheless allows testing the respondents’ attentiveness to the attribute levels and definitions. If a respondent answered these four questions randomly, the probability of picking a dominated alternative would be, according to a binomial distribution B(4,1/3), around 0.80. Since each question presented only three alternatives evaluated in only four criteria, picking the dominated alternative should be rare. Among the 128 respondents, however, 8 of them picked a dominated alternative. This proportion (6.3%) is not far from the rate of obvious monotonicity or dominance violations found in other studies, e.g., 5.0% in a study by Ciomek et al. (2017), 10.8% in a study by Beccacece et al. (2015), or 9.0–17.9% reported by Vetschera et al. (2014b). As the choice task required in our study was much simpler, the few individuals violating dominance might have provided random answers or might have misunderstood the criteria preference directions, and for this reason were excluded from further analyses. Therefore, results presented hereafter refer to the 120 respondents that did not pick any dominated alternative.

The fitness of the inferred value functions can be assessed by its “internal error”, i.e., the optimal sum of errors z* provided by the mathematical formulations MP1 and MP2. Let v* denote the inferred value function (defined by the slopes in the optimal solution). A perfect fit between v* and the judgments in J occurs only if the optimal value z* is null.

The fitness of the inferred value functions can also be assessed by its “external error”, i.e., the amount of error if one takes the final ranking of the vehicles in Table 3 (task 2 of the interview) as the correct one. Let \({a}_{[1]}\succ {a}_{\left[2\right]}\succ \cdots \succ {a}_{[6]}\) denote the final ranking (i.e., \({a}_{[j]}\) denotes the j-th best alternative according to the DM). An error is present whenever \({a}_{[x]}\succ {a}_{\left[y\right]}\) and, at the same time, \({v}^{*}\left({a}_{\left[y\right]}\right)\succ {v}^{*}\left({a}_{\left[x\right]}\right)\), according to the inferred value function. A global external error can be defined by the average over all the 15 (5 + 4 + 3 + 2 + 1) pairwise comparisons implicit in the ranking of 6 vehicles:

$$EE=\frac{1}{15}\sum_{x=1}^{5}\sum_{y=x+1}^{6}\underset{}{\mathrm{max}}\left\{{v}^{*}\left({a}_{\left[y\right]}\right)-{v}^{*}\left({a}_{\left[x\right]}\right),0\right\}$$
(4)

Results concerning internal and external error are depicted in Table 4. It should be noted that internal and external errors refer to different judgments and therefore are not comparable.

Table 4 Internal and external error for the different mathematical programs

On average, the internal error is quite small. In 69.2% of the cases the total error does not exceed 0.002 (twice the threshold ε used in MP1/MP2), and in 92.5% of the cases it does not exceed 0.005. Nevertheless, a perfect fit with the additive model is rare. Only 43 (35.8%) cases reproduced J with null error in MP1 (and the same in MP2). In 40 cases (33.3%) total error was 0.002. Further examination of these cases and other cases with higher error reveals an inconsistency of choosing \({a}_{x}\succ {a}_{y}\) in one question for some pair (\({a}_{x},{a}_{y}\)) and choosing \({a}_{y}\succ {a}_{x}\) at a later question. A common example (15% of the respondents) was choosing vehicle A in question 7 and vehicle B in question 11. In both cases, the vehicles (27,000€, 1000 km, 4€/100 km, 150 g/km) and (24,000€, 1300 km, 6€/100 km, 120 g/km) are present. They chose the first vehicle in question 7, even though the second one was present, and the opposite in question 11. A likely explanation is that in question 7 the first vehicle was much cheaper than the third choice available, whereas in question 11 the same vehicle was the most expensive in the set. Most other inconsistencies might be explained by such apparent reluctance in choosing the most expensive vehicle in the triplet. Only in 9 cases (7.5%) the internal error was not attributable to a simple contradiction as illustrated above, exhibiting preferences that cannot be reproduced by an additive model, but might potentially be reproduced by a more general MCDA model [noting for instance that the additive model is a particular case of models such as the Choquet integral and other models allowing interactions among criteria (Beccacece et al. 2015)].

By design, the internal error, i.e., the optimal value z* obtained in MP1 and MP2, does not change in the post-optimization stage, as it becomes a constraint in MP1po and MP2po, respectively. According to the results, constraining the value functions to have typical shapes (MP2) does not increase internal error in general (error increased from MP1 to MP2 in only 1 out of the 120 cases). The corresponding value functions, however, change most of the time (only in 3.3% of the cases the value functions remain unchanged). Comparing the post-optimization solutions, in 89.2% of the cases the optimal value (minimum slope) for MP1po and MP2po is the same, and moreover in 45.0% of the cases both formulations led to the same value functions.

The external error EE concerning the fitness between the inferred value functions and the final ranking (not used in the inference of the value functions) is slightly lower for MP2 compared to MP1 in the initial formulation and very similar in the post-optimal formulations. The error of the post-optimization formulations MP1po and MP2po is noticeably lower compared to the initial formulations, both for MP1 and MP2. Even though on average the error is low, perfect fits were quite rare. The inferred value functions from MP1 and MP1po could reproduce the entire final ranking of six vehicles only in three cases. Using MP2 and MP2po a minor increase was observed (5 and 4 perfect matches, respectively).

4.2 Value function shapes and weights

One can find all sorts of inferred value function shapes among the different respondents, as summarized in Table 5. Considering MP1, the most common shape in the initial formulations is concave (27%), and the least common is convex (12%). Additionally, 17% of the cases display an atypical shape, mainly occurring in criterion Cost. Criterion-wise, vcost tends to be concave (36%) or atypical (31%), vrange tends to be S-shaped (35%) or concave (25%), vmileage also tends to be S-shaped (35%) or concave (29%), and vemissions is mainly linear (45%).

Table 5 Value function shapes

When atypical shapes are excluded in MP2, the concave shape maintains its status as the most common (increasing to 42%) and the number of convex cases also increases much (to 32%). Increases are to be expected, since the atypical shape cases must be distributed among the other possibilities. Somewhat unexpectedly, a side-effect of the disappearance of atypical shapes is the decrease in the number of linear cases and S-shaped cases.

The post-optimization formulations have a marked effect in increasing the number of linear cases, which now become the most common (32% for MP1po and 34% for MP2po), closely followed by S-shaped cases (31% for MP1po and 32% for MP2po). The number of convex cases becomes rather small (5% for MP1po and 6% for MP2po). The shapes do not differ much between MP1po and MP2po. The average value functions (for the 120 respondents) obtained by MP1 and MP2po are depicted in Fig. 3. Comparing the two sets of value functions, despite the similar results between the two sets, it can be observed that cost is the criterion that impacts the most consumer preferences when MP1 is considered, while range is the criterion that has the highest influence on MP2po set of value functions. Globally, emissions is the criterion that influences the least consumer vehicle purchase decisions. When compared to the criteria that consumers intuitively reported were more important in a previous survey to a similar population (Oliveira and Dias 2015), the present results corroborate the importance of cost and the relatively little importance of emissions, but suggest that range matters more than consumers stated.

Fig. 3
figure 3

Average value functions obtained by MP1 (above) and MP2po (below)

The post-optimization formulations MP1po and MP2po are expected to contribute to avoid situations as depicted in Fig. 2 (top), in which one criterion is dominant and other criteria play no role in the additive value function. Recalling that the value of the best performance (\({v}_{k}\left({g}_{k}^{*}\right)\) for a maximization criterion) corresponds to the weight of the value function in the additive model, in the example of Fig. 2 (top) the weight would be null for functions vrange and vemissions.

Table 6 summarizes the difference between the highest and lowest weight among the four criteria considered by one respondent, on average. The post-optimal formulations contribute to reduce to approximately half the weights amplitude (difference between the highest and the lowest weight among the four criteria). On average, this corresponds to decreasing the weight of Cost and increasing the weight of Emissions (Fig. 3). Not only does the mean amplitude of the weights decrease, the standard deviation also decreases. Comparing MP1 with MP2 no clear conclusion emerges.

Table 6 Difference between the highest and the lowest weight inferred for one respondent

4.3 Vehicles chosen

Assuming the final ranking provided by the respondents corresponds to their real preferences, the most preferred alternative would be the HEV vehicle, followed by the Diesel and PHEV options (Table 7). The predictive ability of the different formulations based on the choice-based questions can then be assessed in this perspective, which complements the analysis of the external error in Table 4.

Table 7 Respondent’s choices and mean position in the ranking (final ranking)

Table 8 presents the global value and rank of each vehicle, according to the average value function obtained for each formulation. All formulations are on average well aligned with the respondents’ choices. The post optimal formulations correctly predict the HEV as the top choice of the consumers. Even though these formulations predict the PHEV would be the second choice, the differences between the PHEV and the Diesel vehicles are small both in terms of the respondents mean rank in Table 7 (~ 5%) and in terms of average value in Table 8 (~ 3%).

Table 8 Global value and rank of each vehicle, according to the average inferred value functions

5 Discussion and conclusions

This study experimented with the possibility of using choice-based questionnaires in disaggregation methods in an additive value function framework. It also experimented with a variant to constrain the value functions to have a typical shape and with post-optimization variants increasing the minimum slope of such functions. Data were obtained by interviewing more than a hundred individuals concerning the choice of vehicles with different powertrain technologies. The decision problem addressed was relatively simple in terms of number of criteria (only four), but still quite challenging due to the magnitude of the consequences for a typical individual (a large investment), and due to the need to ponder economic, practical, and environmental concerns.

DMs were found to be frequently inconsistent in choice-based questions, namely choosing \({a}_{x}\succ {a}_{y}\) in one question for some pair (\({a}_{x},{a}_{y}\)) and choosing \({a}_{y}\succ {a}_{x}\) at a later question. When the input is obtained from a set of choice-based questions, intransitivity and inconsistency can occur for several reasons: people do not pay enough attention, they are unable to be fully consistent with a decision rule, they change their mind, or they simply make errors (Korhonen et al. 2012). The existence of a third alternative can influence the focus when comparing the other two, or there can be effects such as considering one as the reference (Morton and Fasolo 2009), and decision makers can focus on a single attribute as a cue or use another simplification heuristic (del Campo et al. 2016). Finally, the order in which questions appear can have an effect (Czajkowski et al. 2014). The literature also discusses that even the presentation of the alternatives in a tabular format might be a poor match for the cognitive style of some respondents (Engin and Vetschera 2017). In most of the inconsistency cases, we observed some reluctance in choosing the most expensive vehicle available in each choice set. However, further research, especially if it involves a debriefing stage, is needed to understand better the choice inconsistencies observed. The use of inconsistency correction methods, a popular research topic in pairwise judgment matrices (e.g., Bozóki et al. 2011), can be used to guide respondents in identifying and possibly correcting their inconsistencies.

Although many DMs were inconsistent, the inferred value function reproduced their 14 choices with a relatively small internal error, and without any error in 35.8% of the cases. The use of the alternative formulation MP2 to avoid atypical value function shapes did not require a “premium”, in terms of increased error.

The interviewed DMs also ranked a set of six vehicles after going through a structured MCDA task and modifying the final result as they wished. Considering this ranking as their real preferences, allowed us to assess the external error of the inferred value functions. Constraining value functions to avoid atypical shapes was slightly beneficial in terms of external error. The main benefit in terms of external error, however, was the use of the post optimization formulations MP1po and MP2po.

Post optimization formulations were also beneficial according to other characteristics. Formulation MP1po was found to be beneficial in reducing the number of value functions with an atypical shape, and both MP1po and MP2po increased the number of linear value functions, which is desirable from an “Occam’s razor” perspective of using the simplest model possible. MP1po and MP2po were also found to be beneficial in reducing the differences in the implicit criteria weights.

Overall, the best performance was thus obtained by the post optimization formulations. Without worsening its internal error, they have more typical shapes and contribute to decrease extreme imbalances among criteria weights. Moreover, they actually improve the external error on average, and the resulting average function yields as the most preferred vehicle the one most respondents ranked first. Concerning these advantages of the post optimization formulations, MP1po and MP2po appear to be similar. The added computational cost of the latter (since it uses binary variables) might be justified only for its ability to exclude atypical value functions and to result in slightly lower imbalances among criteria weights.

The conclusions of this study are necessarily limited by its scope. They are based on a single study concerning a specific situation. Further empirical studies would therefore be welcome, and even more if questions about the effort and comfort of the respondents are included as well (an aspect not included in our questionnaire). Besides studies involving real persons, simulation studies are also of interest for their ability to generate a very large number of instances and to control many design variables, such as the number of criteria. Although challenging, being able to simulate the inconsistencies real people make would add further value to such studies.

Knowing what are the typical value function shapes for each criterion in a given decision context can be extremely useful for different types of research. These shapes can be used as an input for sensitivity and robustness analysis studies addressing unknown value functions, as performed in exact (e.g., Sarabando and Dias 2010) or stochastic analyses (e.g., Lahdelma and Salminen 2001). Although the specific value function might be unknown, one may wish to simulate specific value function types (Dias and Vetschera 2019a, b). They can also be useful in studying the behavior of different methods on simulated data (Vetschera et al. 2014a; Mihelčić and Bohanec 2017; Dias and Vetschera 2019b). Finally, they can be used in studies aiming to forecast the diffusion of new products and technologies using system dynamics or agent based models [e.g., in the context of alternative fuels (Stummer et al. 2015; Oliveira et al. 2019)].